Hi folks,
At the end of a run with 2.1-dev-20 with IndexReport set to 3, I get the
following messages:
...
Removing very common words...
Warning: This proccess can take some time. For a faster one, use
IgnoreWords instead of IgnoreLimit
371 words removed.
35 words removed not in common words array:
i, 1, b, c, d, e, f, g, h, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y,
z, 2, 3, 4, 5, 6, 8, 7, 9, 0,
Writing main index...
...
But were the 35 words reported as "not in common words array" (all of them
single characters) really common in my content (we have IgnoreLimit set to
99 1000), or were they removed only because they fall below my MinWordLimit
setting of 2? In other words, to get optimal indexing performance, do we
need to avoid both IgnoreLimit _and_ MinWordLimit? I would have thought
that MinWordLimit was applied during indexing, so that those words would
never enter the index in the first place and would not need to be removed
afterwards.
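For reference, the relevant lines of our config file are roughly these
(values as described above; everything else is omitted):
...
IndexReport 3
IgnoreLimit 99 1000
MinWordLimit 2
...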
cheers,
David
Received on Mon Apr 2 01:20:39 2001