index size versus searching for quoted string tradeoffs (possible

From: Bill Schell <friedfish(at)not-real.optonline.net>
Date: Wed Jun 16 2004 - 16:24:33 GMT
I just thoroughly confused myself by searching for a phrase ("Text of
Report") that I knew was in the documents I had just indexed. I couldn't
find it! After some head scratching I realized that the word 'of' is in
the file cited in the IgnoreWords configuration directive.

If this confused me, it will *really* confuse my users, who know nothing
about any IgnoreWords file. They would have to figure out that they
should enter "Text Report", although that is not what is in the
document. The only immediate fix I can think of is to get rid of the
IgnoreWords directive, which will make my indices bigger and slower to
search.

I'm wondering if a future version of swish-e should remove words cited
in the IgnoreWords file from all search terms? Or is the performance
loss from removing the IgnoreWords directive for a reasonable set of
common English words not worth worrying about?
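
To make the first idea concrete, here is a rough sketch in Python of a
front-end filter that drops ignored words from quoted phrases before the
query is handed to the search engine. This is not swish-e code: the
function name strip_ignored_words and the word set are made up for
illustration, and it assumes (as in the "Text Report" case above) that
the index matches a phrase once its ignored words are removed.

```python
import re

# Hypothetical ignore-word set for illustration; in practice it would be
# loaded from the same file that the IgnoreWords directive points at.
IGNORE_WORDS = {"of", "the", "a", "an", "and"}

def strip_ignored_words(query, ignore_words=IGNORE_WORDS):
    """Drop ignored words from inside each double-quoted phrase of a query."""
    def clean_phrase(match):
        # Keep only the words that are not in the ignore list (case-insensitive).
        kept = [w for w in match.group(1).split() if w.lower() not in ignore_words]
        return '"' + " ".join(kept) + '"'
    # Rewrite every "..." phrase; text outside the quotes is left untouched.
    return re.sub(r'"([^"]*)"', clean_phrase, query)

print(strip_ignored_words('"Text of Report"'))  # prints "Text Report"
```

With something like this in front of the search form, a user could type
the phrase exactly as it appears in the document and the query that
reaches the index would already have 'of' removed.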

Bill
Received on Wed Jun 16 16:24:37 2004