Re: Swish

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sat Jun 02 2001 - 12:58:04 GMT
At 10:39 PM 06/01/01 -0700, Peter Kelly wrote:

Hi Peter,

I'm cc'ing the list.  That's the place to ask questions.

>When I run the index as a perl script, I see everything happening as
>normal, but once it finishes collecting all the words from these
>documents, the thing pauses forever on "removing common words."  I don't
>think the machine is running out of memory really.... running vmstat
>indicates memory levels are holding steady.

Are you using IgnoreLimit in your config file?  If so, read the
documentation for that directive:

http://sunsite.berkeley.edu:4444/SWISH-CONFIG.html#item_IgnoreLimit

Note that Jose has since rewritten some of that code, and it is much
faster in the 2.1 dev version.  Check the list archives for information on
2.1.  Even so, use IgnoreWords instead of IgnoreLimit.
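
To illustrate the difference, here is a rough Python sketch.  It is not
swish-e's code, just my reading of the two directives: IgnoreLimit has to
count, for every word, how many files contain it before it can decide
what is "common", while IgnoreWords only checks words against a fixed
list.  The function names and the sample data are made up for the
example.

```python
from collections import defaultdict

def words_over_limit(docs, percent, min_files):
    """Sketch of an IgnoreLimit-style pass (an assumption, not swish-e's code).

    Returns the words that occur in more than `percent` percent of the
    files AND in more than `min_files` files.  `docs` maps a file name
    to its text.
    """
    files_with_word = defaultdict(int)
    for text in docs.values():
        # Count each word once per file, however often it repeats.
        for word in set(text.lower().split()):
            files_with_word[word] += 1
    total = len(docs)
    return {
        word
        for word, n in files_with_word.items()
        if n > min_files and n * 100 > percent * total
    }

def remove_stopwords(words, stopwords):
    """Sketch of an IgnoreWords-style check: one lookup per word."""
    return [w for w in words if w not in stopwords]

# Hypothetical sample data.
docs = {
    "a.txt": "the cat",
    "b.txt": "the dog",
    "c.txt": "the bird",
    "d.txt": "a cat",
}
common = words_over_limit(docs, percent=50, min_files=2)
kept = remove_stopwords(["the", "cat"], common)
```

The first function needs per-word file counts for the whole collection
before it can drop anything, which is the kind of work that happens
after all the words have been collected.  The second needs only the
list.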


>What's this about "FileFilter .doc "/usr/local/bin/catdoc" "-s8859-1
>-d8859-1 '%p'"" ??

That says: for every file that ends in .doc, run the catdoc program to
extract the text from the Word document.
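
As a rough picture of that behaviour, here is a Python sketch.  It is not
how swish-e implements it.  I am assuming '%p' stands for the path of the
file being indexed, and the FILTERS table and function names are
invented for the example.

```python
import shlex
import subprocess

# Hypothetical table mirroring the FileFilter line quoted above:
# suffix -> (filter program, option string).
FILTERS = {
    ".doc": ("/usr/local/bin/catdoc", "-s8859-1 -d8859-1 '%p'"),
}

def build_filter_command(path, filters=FILTERS):
    """Return the argument list to run for `path`, or None if no filter applies."""
    for suffix, (program, options) in filters.items():
        if path.endswith(suffix):
            # Split the option string first, then substitute the path, so a
            # path containing spaces stays a single argument.
            args = [arg.replace("%p", path) for arg in shlex.split(options)]
            return [program] + args
    return None

def extract_text(path, filters=FILTERS):
    """Return the text to index: filter output if a filter matches, else the file itself."""
    command = build_filter_command(path, filters)
    if command is None:
        with open(path, encoding="latin-1") as handle:
            return handle.read()
    # Needs the filter program installed; raises if it exits with an error.
    result = subprocess.run(command, capture_output=True, check=True)
    return result.stdout.decode("latin-1")
```

So a .doc file is handed to catdoc and its output is what gets indexed,
while a file with no matching suffix is read directly.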



Bill Moseley
mailto:moseley@hank.org
Received on Sat Jun 2 12:59:42 2001