Re: PHRASE search: More about stopwords

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu May 11 2000 - 18:41:58 GMT
At 10:32 AM 05/11/00 -0700, Jose Manuel Ruiz wrote:
>Some things about phrase search and stopwords that
>will be implemented in next beta.
>
>Since there are several opinions about stopwords
>and word position, I think this may be a solution
>for all: A config file option to enable or disable 
>position increasing when stopwords are in a phrase. 
>In the second case, indexing will be slower because positions
>must be recalculated when automatic stopwords are found.
>
>Words shorter than minwordlength are also added to stopwords
>list. In fact, they are like stopwords.

Am I correct that if you specify stopwords in the config file, such that no
additional (automatic) stopwords are found during indexing, swish will
not need to reposition?  In other words, will indexing speed be the same if
no automatic stopwords are found?
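
To make the two position modes concrete, here is a rough Python sketch of
how word positions could be assigned with and without counting stopwords.
It is only an illustration of the idea under discussion, not swish's actual
code: the function name, the bump_on_stopword flag, and the sample data are
all made up for this example.

```python
def assign_positions(words, stopwords, min_word_length=1, bump_on_stopword=True):
    """Return (word, position) pairs for the words that would be indexed.

    Hypothetical helper: bump_on_stopword stands in for the proposed
    config option and is not a real swish directive name.
    """
    indexed = []
    pos = 0
    for word in words:
        # Words shorter than min_word_length are treated like stopwords.
        is_stop = word in stopwords or len(word) < min_word_length
        if is_stop:
            if bump_on_stopword:
                pos += 1  # the stopword still occupies a slot in the phrase
            continue
        pos += 1
        indexed.append((word, pos))
    return indexed


words = "the quick brown fox of the north".split()
stop = {"the", "of"}

with_bump = assign_positions(words, stop, bump_on_stopword=True)
without_bump = assign_positions(words, stop, bump_on_stopword=False)
```

With the bump enabled, "fox" and "north" end up three positions apart, so
the phrase "fox north" would not match; with it disabled they are adjacent.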


>Also, the same rules applied to words when indexing
>will also be applied when searching: IgnoreFirstChar,
>IgnoreLastChar, etc.

Fantastic!
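
As I understand it, the point is that a query word should go through the
same cleanup as an indexed word, so the two compare equal. A minimal
sketch, assuming IgnoreFirstChar and IgnoreLastChar simply strip the listed
characters from the start and end of a word (the normalize function and the
tiny index below are hypothetical, not swish internals):

```python
def normalize(word, ignore_first_chars="", ignore_last_chars=""):
    """Strip ignorable leading and trailing characters from a word."""
    word = word.lstrip(ignore_first_chars)
    word = word.rstrip(ignore_last_chars)
    return word


# Assumed settings for the example only.
FIRST = "\"'("
LAST = "\"').,;"

# Index time: every word is normalized before it is stored.
document = ['"hello,', "(world)."]
index = {normalize(w, FIRST, LAST) for w in document}

# Search time: the query word goes through the very same function.
query = "world."
found = normalize(query, FIRST, LAST) in index
```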

>But, something comes to my mind:
>what to do in file merge if these rules are not
>identical? Any ideas?

>Reject merge? Merge wordchars and issue a warning? 

That seems reasonable.  Wouldn't want to break existing merge scripts.
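
Something along these lines is what I have in mind for the "merge and
warn" option. This is only a sketch under my own assumptions: each index
header is modeled as a dict of character-set settings, and merge_rules is a
made-up name, not an existing swish function.

```python
def merge_rules(rules_a, rules_b):
    """Union the character-set rules of two indexes, warning on mismatch.

    Returns (merged_rules, warnings). The merge is never rejected, so
    existing merge scripts keep working.
    """
    merged = {}
    warnings = []
    for key in sorted(set(rules_a) | set(rules_b)):
        a = rules_a.get(key, "")
        b = rules_b.get(key, "")
        if a != b:
            warnings.append(f"warning: {key} differs between indexes; using the union")
        # Union of both character sets, sorted for a stable result.
        merged[key] = "".join(sorted(set(a) | set(b)))
    return merged, warnings


merged, warnings = merge_rules(
    {"WordCharacters": "abc", "IgnoreLastChar": "."},
    {"WordCharacters": "bcd", "IgnoreLastChar": "."},
)
```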



Bill Moseley
mailto:moseley@hank.org
Received on Thu May 11 14:45:24 2000