Re: swish-e -D index.swish

From: <jmruiz(at)not-real.boe.es>
Date: Tue Sep 19 2000 - 15:31:10 GMT
Hi Bas

On 19 Sep 2000, at 5:23, Bas Meijer wrote:

> Hi,
> 
> 
> Swish-e 1.3.x has a -D flag for decompressing indexfiles to stdout.
> A lot of numbers pass and lines with something like this format:
> word: num num num num ...
> 
> Is this flag still supported in 2.0.1? (I still need to upgrade 
> lookup to that version).
> 
> Does anyone know what these numbers mean? I hope they can be useful
> for an idea I have in post-processing the index. At this point I have
> only made dictionary.cgi which allows you to browse the words in the
> index in an alphabetical way and search with them with lookup.cgi.
>
Yes, it is still supported. Now, it gives more info:

Try swish-e -v 4 -D test.index

Ignore OFSETS INFO and HASHOFFSETS INFO. I use them for debugging.
You will see something like this for the words (WORD INFO part):

myword: Meta:1 ./test_meta.html Rank:5800 Strct:7 Freq:2 Pos:3 15

This means that the word "myword" is in MetaName 1 (no MetaName), in
the file ./test_meta.html, with a rank of 5800 and a structure of 7
(as in 1.3.x). The frequency is 2, and the positions of the word in
the file are 3 and 15.
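
For post-processing, a verbose line like the one above could be split into
named fields. This is only a minimal sketch: the regular expression is
modelled on the single sample line in this message, the function name
parse_verbose_entry is made up for illustration, and file names containing
spaces are not handled.

```python
import re

# Pattern derived from one sample line only; real output may vary
# between swish-e versions (assumption).
ENTRY_RE = re.compile(
    r"^(?P<word>\S+): Meta:(?P<meta>\d+) (?P<file>\S+) "
    r"Rank:(?P<rank>\d+) Strct:(?P<strct>\d+) "
    r"Freq:(?P<freq>\d+) Pos:(?P<pos>\d+(?: \d+)*)$"
)

def parse_verbose_entry(line):
    """Split one 'WORD INFO' line from 'swish-e -v 4 -D' into a dict."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        raise ValueError("line does not look like a WORD INFO entry: %r" % line)
    return {
        "word": m.group("word"),
        "meta": int(m.group("meta")),
        "file": m.group("file"),
        "rank": int(m.group("rank")),
        "strct": int(m.group("strct")),
        "freq": int(m.group("freq")),
        "positions": [int(p) for p in m.group("pos").split()],
    }

entry = parse_verbose_entry(
    "myword: Meta:1 ./test_meta.html Rank:5800 Strct:7 Freq:2 Pos:3 15"
)
```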

The same info without -v 4 will look like:

myword: 1 8 5800 7 2 3 15

Here 8 is the filenumber, shown in place of the file name.

BTW, I use the positions for implementing phrase search.
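
To illustrate why positions make phrase search possible, here is a small
sketch. It is not the swish-e code, just one plausible way to do it, and it
assumes that consecutive words in a file get consecutive position numbers.
The function name phrase_starts and the sample data are invented for the
example.

```python
def phrase_starts(word_positions):
    """Return the start positions where the words occur as a phrase.

    word_positions holds one list of positions per phrase word, in
    phrase order, all taken from the same file.
    """
    if not word_positions:
        return []
    later_words = [set(p) for p in word_positions[1:]]
    return [
        start
        for start in word_positions[0]
        # The k-th following word must sit exactly k+1 positions later.
        if all(start + k + 1 in pos for k, pos in enumerate(later_words))
    ]

# Hypothetical data: first word at positions 3 and 15, second at 4 and 20.
starts = phrase_starts([[3, 15], [4, 20]])
```

With these made-up positions the two words are adjacent only once, starting
at position 3.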

Of course, a word can have several sets of this info. In that case,
the sets are sorted by metaname, then filenumber.
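
The compact line shown earlier (without -v 4) could be read back like this.
Again only a sketch under two assumptions that the message does not confirm:
that the number of positions always equals the frequency (as in the sample,
where the frequency is 2 and there are 2 positions), and that several sets
for one word simply follow each other on the same line. The name
parse_compact_entry is invented for the example.

```python
def parse_compact_entry(line):
    """Parse 'word: meta filenum rank strct freq pos...' into records."""
    word, sep, rest = line.rpartition(":")
    if not sep:
        raise ValueError("no 'word:' prefix in %r" % line)
    numbers = [int(tok) for tok in rest.split()]
    records = []
    i = 0
    while i < len(numbers):
        if i + 5 > len(numbers):
            raise ValueError("truncated set in %r" % line)
        meta, filenum, rank, strct, freq = numbers[i:i + 5]
        # Assumption: exactly 'freq' positions follow the five fixed fields.
        positions = numbers[i + 5:i + 5 + freq]
        if len(positions) != freq:
            raise ValueError("missing positions in %r" % line)
        records.append({
            "meta": meta,
            "filenum": filenum,
            "rank": rank,
            "strct": strct,
            "freq": freq,
            "positions": positions,
        })
        i += 5 + freq
    return word.strip(), records

word, records = parse_compact_entry("myword: 1 8 5800 7 2 3 15")
```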

All the words are sorted.
 
If you need more info let me know.

cu
Jose
Received on Tue Sep 19 15:31:28 2000