Re: [swish-e] count number of times a word occurs in an

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sat Dec 29 2007 - 05:32:29 GMT

On Fri, Dec 28, 2007 at 10:05:43PM -0600, Eric Jobidon wrote:
> index[resolved]
> 
> Is it appropriate to interpret the "position data" as a page number? So
> "(5/9)" would indicate that the word occurs (at least once) on page 5 of a
> nine page document? 

No. The first number is the word position in the document, and the
second is the "structure", which indicates where in an (HTML) file the
word is found.

The word position isn't of much use relating to the source document.
Swish uses it for phrase matching.
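
To illustrate why word positions are enough for phrase matching, here
is a minimal Python sketch of the general idea. It is not Swish-e's
implementation; the function names are made up for illustration, and
the starting position of 5 is only chosen to mirror the Pos values
Swish reports for the sample file in this message.

```python
from collections import defaultdict


def build_positions(text, start=1):
    """Map each word to the list of positions where it occurs."""
    positions = defaultdict(list)
    for offset, word in enumerate(text.lower().split()):
        positions[word].append(start + offset)
    return positions


def phrase_matches(positions, phrase):
    """Return the start positions where the phrase's words are consecutive."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    for pos in positions.get(words[0], []):
        # Word i of the phrase must sit exactly i positions after the first.
        if all(pos + i in positions.get(w, []) for i, w in enumerate(words)):
            hits.append(pos)
    return hits


index = build_positions("hello hello there", start=5)
print(phrase_matches(index, "hello there"))  # [6]
print(phrase_matches(index, "there hello"))  # []
```

The positions only need to be consistent with each other, which is why
they say nothing useful about pages or lines in the source document.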



$ cat 1
hello hello there

$ swish-e -T indexed_words -v0 -i 1
    Adding:[1:swishdefault(1)]   'hello'   Pos:5  Stuct:0x9 ( BODY FILE )
    Adding:[1:swishdefault(1)]   'hello'   Pos:6  Stuct:0x9 ( BODY FILE )
    Adding:[1:swishdefault(1)]   'there'   Pos:7  Stuct:0x9 ( BODY FILE )
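
The structure value is a set of bit flags, so 0x9 is two flags combined.
A small Python sketch of decoding it follows. Only the names BODY and
FILE come from the output above; which bit belongs to which name
(0x1 for FILE, 0x8 for BODY) is an assumption here and should be checked
against the Swish-e source before relying on it.

```python
# Assumed bit values. The output above only shows that 0x9 means "BODY FILE".
STRUCTURE_FLAGS = {0x1: "FILE", 0x8: "BODY"}


def decode_structure(value):
    """Return the flag names set in a structure value, highest bit first."""
    names = [
        name
        for bit, name in sorted(STRUCTURE_FLAGS.items(), reverse=True)
        if value & bit
    ]
    # Report any bits this sketch does not know about instead of hiding them.
    unknown = value & ~sum(STRUCTURE_FLAGS)
    if unknown:
        names.append(f"UNKNOWN(0x{unknown:x})")
    return names


print(decode_structure(0x9))  # ['BODY', 'FILE']
```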

$ swish-e -T index_words

-----> WORD INFO in index index.swish-e <-----

hello [1 1 2 (5/9 6/9)]

there [1 1 1 (7/9)]



$ swish-e -T index_words_full

-----> WORD INFO in index index.swish-e <-----

hello
 Meta:1 1 Freq:2 Pos/Struct:5/9,6/9

there
 Meta:1 1 Freq:1 Pos/Struct:7/9
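
For the original question of counting how often a word occurs, one
option is to parse the -T index_words output. The Python sketch below
is a rough illustration only: the regular expression is based solely on
the two sample lines in this message for a single-file index, and the
function name is hypothetical. Comparing with the index_words_full
output, the third number in the brackets is the frequency; the second
number is not labeled there, so the sketch ignores it.

```python
import re

# Matches lines such as: hello [1 1 2 (5/9 6/9)]
LINE_RE = re.compile(r"^(\S+) \[(\d+) (\d+) (\d+) \(([^)]*)\)\]$")


def parse_index_words_line(line):
    """Parse one word line; return None for blank or unrecognized lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    word, meta, _unlabeled, freq, pairs = m.groups()
    positions = []
    for pair in pairs.split():
        pos, struct = pair.split("/")
        # Keep the structure as text; its number base is not stated here.
        positions.append((int(pos), struct))
    return {"word": word, "meta": int(meta), "freq": int(freq),
            "positions": positions}


sample = """hello [1 1 2 (5/9 6/9)]

there [1 1 1 (7/9)]"""

counts = {}
for line in sample.splitlines():
    entry = parse_index_words_line(line)
    if entry is not None:
        counts[entry["word"]] = len(entry["positions"])

print(counts)  # {'hello': 2, 'there': 1}
```

In the sample the number of position pairs agrees with the Freq value,
so either can be used as the count.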

-- 
Bill Moseley
moseley@hank.org

Received on Sat Dec 29 00:14:35 2007