PHRASE search: More about stopwords

From: Jose Manuel Ruiz <jmruiz(at)not-real.boe.es>
Date: Thu May 11 2000 - 17:32:44 GMT
Hi all,

Here are some things about phrase search and stopwords that
will be implemented in the next beta.

Since there are several opinions about stopwords
and word position, I think this may be a solution
for everyone: a config file option to enable or disable
position increments when stopwords appear in a phrase.
With increments disabled, indexing will be slower because positions
must be recalculated when automatic stopwords are found.
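
To make the two behaviours concrete, here is a minimal Python sketch. It is
not the swish-e code: the function name assign_positions and the flag
count_stopwords are made up for illustration, and it assumes the stopword
list is known before positions are assigned (automatic stopwords, found
later, are what would force the recalculation mentioned above).

```python
def assign_positions(words, stopwords, count_stopwords=True):
    """Return (word, position) pairs for the words that get indexed.

    count_stopwords is a hypothetical stand-in for the proposed config
    option: when True, a stopword still consumes a position; when False,
    it is skipped as if it were not in the text at all.
    """
    indexed = []
    position = 0
    for word in words:
        if word in stopwords:
            if count_stopwords:
                position += 1  # the stopword leaves a gap in the positions
            continue
        indexed.append((word, position))
        position += 1
    return indexed


words = "the quick brown fox of the north".split()
stop = {"the", "of"}

with_gaps = assign_positions(words, stop, count_stopwords=True)
without_gaps = assign_positions(words, stop, count_stopwords=False)
```

With increments enabled, "fox" and "north" end up three positions apart, so
the phrase "fox north" does not match. With increments disabled they are
adjacent, so it does.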

Words shorter than minwordlength are also added to the stopword
list. In effect, they are treated like stopwords.

Also, the same rules applied to words when indexing
will also be applied when searching: IgnoreFirstChar,
IgnoreLastChar, etc. But something comes to mind:
what should happen in a file merge if these rules are not
identical? Any ideas?
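
As a rough illustration of why the rules must match on both sides, here is
a small Python sketch. The helper normalize and its parameters are
hypothetical, and it assumes IgnoreFirstChar and IgnoreLastChar simply
strip the listed characters from the start and end of a word, which may
not match the real implementation in every detail.

```python
def normalize(word, ignore_first="", ignore_last=""):
    """Strip ignored leading and trailing characters from a word.

    ignore_first / ignore_last are assumed stand-ins for the
    IgnoreFirstChar / IgnoreLastChar settings.
    """
    return word.lstrip(ignore_first).rstrip(ignore_last)


# The same settings are used for the indexed text and for the query.
settings = {"ignore_first": '"(', "ignore_last": '",.)'}

indexed_terms = {normalize(w, **settings) for w in '("hello", world.)'.split()}
query_term = normalize('"hello"', **settings)
found = query_term in indexed_terms
```

If the query were normalized with different settings than the index, the
quoted term would not be reduced to the same string and the lookup would
miss.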

Merging files: Now I am checking a bug found by Andrew Linn.

Some more questions about merging:
What to do if wordchars or rules are not the same?
Reject the merge? Merge wordchars and issue a warning?
What to do with values like minwordchar?
Take the highest value of minwordchar? etc.
Any comments?
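
To help the discussion, here is one possible policy written as a Python
sketch. Nothing here is decided or implemented: merge_settings, the strict
flag, and the dictionary layout are all hypothetical. It shows both options
raised above, rejecting the merge or merging wordchars with a warning, and
takes the highest minwordchar.

```python
def merge_settings(a, b, strict=False):
    """Combine the settings of two index files.

    a and b are assumed to be dicts with a "wordchars" string and a
    "minwordchar" integer. Returns (merged_settings, warnings).
    With strict=True, differing wordchars reject the merge instead.
    """
    warnings = []
    if a["wordchars"] != b["wordchars"]:
        if strict:
            raise ValueError("wordchars differ; refusing to merge")
        warnings.append("wordchars differ; using the union of both sets")
    wordchars = "".join(sorted(set(a["wordchars"]) | set(b["wordchars"])))

    if a["minwordchar"] != b["minwordchar"]:
        warnings.append("minwordchar differs; using the highest value")
    minwordchar = max(a["minwordchar"], b["minwordchar"])

    return {"wordchars": wordchars, "minwordchar": minwordchar}, warnings


index_a = {"wordchars": "abc", "minwordchar": 2}
index_b = {"wordchars": "bcd", "minwordchar": 3}
merged, notes = merge_settings(index_a, index_b)
```

Taking the highest minwordchar is the conservative choice: words shorter
than that were never indexed in at least one of the files, so they cannot
be searched consistently in the merged result.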

Jose Ruiz

jmruiz@boe.es
Received on Thu May 11 13:37:24 2000