Re: Swish-E 2.0 and PDF indexing

From: Jeffrey Grunstein <JEFFREY.GRUNSTEIN(at)not-real.ny.frb.org>
Date: Thu Jan 04 2001 - 21:32:07 GMT
I'm not sure about the paging - I'll find out.
I have both IgnoreLimit and IgnoreWords commented out.

I'll try recompiling on a 2.6 machine.

Do you have any idea when 2.2 will be out?

>>> <jmruiz@boe.es> 01/04 1:17 PM >>>

Hi Jeffrey,

Is your Solaris paging? 2.0 uses more memory than 1.3
because of the phrase search. We have fixed this problem
in 2.1, but 2.1 is in an alpha state (it will be 2.2 when stable).

Do you have IgnoreLimit set? It is very expensive because
of the phrase search (for each word found, swish needs to
recompute all word positions in all files). Remove
it and try again. Using IgnoreWords instead of IgnoreLimit
is a better idea.

Another issue: I remember somebody on the list having
problems with a swish-e binary compiled for Solaris 8. He got
a binary compiled on a 2.6 box to fix the problem.

cu
Jose

On 4 Jan 2001, at 9:22, Jeffrey Grunstein wrote:

> I just upgraded from Swish-E 1.32 to 2.0 and am having a performance
> problem indexing PDFs. It took almost 19 hours to index my site (-S fs
> option) with 3220 files. Many are PDFs but I don't have an exact
> count.
> 
> With Swish-E 1.3 (running in production now - I'm testing 2.0), the
> same index takes about 90 minutes. As far as I know, I'm indexing PDFs
> with 1.3 also, but how can I tell for sure whether I am?
> 
> Can anyone explain why it takes so much longer with 2.0 than 1.3? I'm
> running this on a Sun Enterprise 450 with 4 Gigs of RAM, running
> Solaris 8.
> 
> 
> Thanks!
> 
Received on Thu Jan 4 21:35:33 2001