Hi,
As Rainer points out, the file format is still changing frequently. It is
not an easy format because it packs many unrelated kinds of information
into just one file:
- Header info
- Inverted index (used in wildcard searches)
- Hash index (used in direct searches)
- Words data (includes frequency, positions, etc)
- Document data (including properties)
- Properties/Metanames information
- Presorted lookup tables for properties
Much of this info is also compressed. The file is also "portable" (no
big-endian/little-endian problems with numbers).
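To illustrate what "compressed" and "portable" can mean for numbers, here is
a minimal sketch of one common technique: a variable-length integer written
7 bits per byte in a fixed byte order, so the result does not depend on the
machine's endianness. This is an assumption for illustration only, not taken
from the swish-e source; the function names pack_varint and unpack_varint are
hypothetical.

```python
from typing import Tuple


def pack_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte.

    Groups are written most significant first. Every byte except the
    last has its high bit set, marking "more bytes follow".
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]                   # final byte: high bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: high bit set
        n >>= 7
    return bytes(reversed(groups))


def unpack_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at offset; return (value, next_offset)."""
    value = 0
    while True:
        byte = data[offset]
        offset += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:               # high bit clear: last byte
            return value, offset
```

Small values such as word frequencies take a single byte, and because the
bytes are read one at a time in a defined order, the same file decodes
identically on big-endian and little-endian hosts.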
Use -D to see the contents of the index file. E.g.:
swish-e -D file.index
or
swish-e -v 4 -D file.index
cu
Jose
On 1 May 2001, at 13:14, Rainer.Scherg@rexroth.de wrote:
> Jose is the expert here.
>
> But the index file format may change in future versions.
> This is also because there are plans to offer a choice of
> several database modules to store the index.
>
> E.g.: swish indexfile
> sql database (oracle, mysql, ??)
> ...
>
> cu - rainer
>
>
> > -----Original Message-----
> > From: Prosper Correa [mailto:prosper@correa.org]
> > Sent: Tuesday, May 01, 2001 7:34 PM
> > To: Multiple recipients of list
> > Subject: [SWISH-E] Index file structure
> >
> >
> > Where can I find a description of the index file structure?
> >
> > Thanx in advance,
> >
> > Prosper Correa
>
>
Received on Thu May 3 11:57:51 2001