Is MinWordLimit getting applied correctly?

From: David Wood <dwood(at)not-real.inter.nl.net>
Date: Mon Apr 02 2001 - 01:14:59 GMT
Hi folks,

At the end of a run with 2.1-dev-20 with IndexReport set to 3, I get the 
following messages:

...
Removing very common words...
Warning: This proccess can take some time. For a faster one, use 
IgnoreWords instead of IgnoreLimit
371 words removed.
35 words removed not in common words array:
i, 1, b, c, d, e, f, g, h, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, 
z, 2,
  3, 4, 5, 6, 8, 7, 9, 0,
Writing main index...
...

But were the 35 listed non-common words really common in my content (we 
have IgnoreLimit set to 99 1000), or were they removed simply because they 
are shorter than my MinWordLimit setting of 2?  In other words, to get 
optimal indexing performance, do we need to avoid both IgnoreLimit _and_ 
MinWordLimit?  I would have thought that MinWordLimit would be applied 
during indexing, so those words would never get into the index in the 
first place and therefore would not need to be removed afterwards.
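
The two behaviours being contrasted can be sketched in Python as follows. 
This is only an illustration under stated assumptions, not swish-e's actual 
code: the function names and sample documents are hypothetical, and plain 
whitespace splitting stands in for the real word-splitting rules.

```python
from collections import Counter


def index_filter_early(docs, min_word_len):
    """Hypothetical: drop short words while tokenizing, so they never enter the index."""
    index = Counter()
    for text in docs:
        for word in text.lower().split():
            if len(word) >= min_word_len:
                index[word] += 1
    return index


def index_filter_late(docs, min_word_len):
    """Hypothetical: index every word, then strip short words in a separate pass."""
    index = Counter()
    for text in docs:
        index.update(text.lower().split())
    # Extra pass over the whole vocabulary, analogous to a post-indexing removal step.
    removed = sorted(w for w in index if len(w) < min_word_len)
    for w in removed:
        del index[w]
    return index, removed


# Made-up sample content; a minimum length of 2 mirrors the MinWordLimit value above.
docs = ["i saw b and c", "section 2 of the plan", "the plan b"]
early = index_filter_early(docs, 2)
late, removed = index_filter_late(docs, 2)
```

Both strategies end with the same vocabulary, but only the second one has 
short words to report as removed, which is what the output above appears 
to show.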

cheers,

David 
Received on Mon Apr 2 01:20:39 2001