RE: Index file structure

From: <jmruiz(at)not-real.boe.es>
Date: Thu May 03 2001 - 11:57:22 GMT

Hi,

As Rainer points out, the file format is changing every day. It is not
an easy format because it packs many unrelated kinds of information
into a single file:
- Header info
- Inverted index (used in wildcard searches)
- Hash index (used in direct searches)
- Words data (includes frequency, positions, etc)
- Document data (including properties)
- Properties/Metanames information
- Presorted lookup tables for properties

Much of this info is also compressed. Also, the file is "portable" (no
big-endian/little-endian problems with numbers).
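
To illustrate what "portable" and "compressed" numbers can look like in
general, here is a minimal Python sketch. It is an assumption made for
illustration only and is not taken from the Swish-e sources: the actual
on-disk encoding may differ, and all function names below are
hypothetical. It shows a fixed byte order for 32-bit integers and a
simple 7-bits-per-byte variable-length encoding for small numbers.

```python
import struct


def pack_u32_portable(n: int) -> bytes:
    """Write a 32-bit unsigned integer in a fixed (big-endian) byte order.

    Hypothetical helper: the bytes are identical on every platform, so a
    file written on one machine can be read on another.
    """
    return struct.pack(">I", n)


def unpack_u32_portable(data: bytes) -> int:
    """Read back a value written by pack_u32_portable."""
    return struct.unpack(">I", data[:4])[0]


def compress_uint(n: int) -> bytes:
    """Variable-length encoding (illustrative, not Swish-e's format).

    Seven payload bits per byte, most significant group first; the high
    bit is set on every byte except the last one.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append(n & 0x7F)
        n >>= 7
    groups.reverse()
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def uncompress_uint(data: bytes) -> int:
    """Decode one value produced by compress_uint."""
    n = 0
    for b in data:
        n = (n << 7) | (b & 0x7F)
        if not b & 0x80:  # high bit clear marks the last byte
            break
    return n
```

With an encoding like this, small values such as word frequencies take
one byte instead of four, and no byte-order conversion is needed when
the file moves between machines.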

Use -D to see the contents of the index file. E.g.:

swish-e -D file.index

or

swish-e -v 4 -D file.index

cu
Jose

On 1 May 2001, at 13:14, Rainer.Scherg@rexroth.de wrote:

> Jose is the expert here.
> 
> But the index file format may change in future versions.
> This is also because there are plans to offer a choice of
> several database modules to store the index.
> 
> E.g.:  swish indexfile
>        sql database  (oracle, mysql, ??)
>        ...
> 
> cu - rainer
> 
> 
> > -----Original Message-----
> > From: Prosper Correa [mailto:prosper@correa.org]
> > Sent: Tuesday, May 01, 2001 7:34 PM
> > To: Multiple recipients of list
> > Subject: [SWISH-E] Index file structure
> > 
> > 
> > Where can I find a description of the index file structure?
> > 
> > Thanx in advance,
> > 
> > Prosper Correa
> 
> 
Received on Thu May 3 11:57:51 2001