On 3 Aug 2000, at 0:53, Bas Meijer wrote:
> We all know that Swish-e needs lots of memory. Time however is not
> mentioned much. Normally we use modern machines, now however we used
> an old 486DX 33 MHz with 16 MB, 200 MB swap.
> We installed a full RedHat 6.2 and indexed all the documentation in /usr/doc,
> some 3640 files with 69738 words. We started the indexing on May 12, 2000.
> We went for a cup of coffee.
> We went for lunch.
> We continued working on other stuff
> We went home for the weekend
> We worked another week
> We went on vacation for three weeks
> We waited some more...
> Yesterday, August 2, swish-e finally wrote its index file of 8391488 bytes.
> Maybe there is room for some speed optimization ;-)
If you are not using swish-e-2.0, try it. It is much faster
at both indexing and searching.
Anyway, I am afraid that the main problem is the small amount of
memory. Swish-e stores all the words and the file info in memory
while it is indexing, so your old 486 with just 16 MB of RAM is
probably paging a lot. Take a look at your swish-e process while
indexing (e.g. with top) to see how much memory it is using.
Most likely your CPU is waiting for I/O instead of computing.
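
If you would rather log the numbers than watch top, here is a minimal
sketch in Python. It assumes a Linux /proc filesystem that reports
VmSize and VmRSS in /proc/<pid>/status. The helper names and the
"virtual size larger than physical RAM" test are my own illustrative
assumptions, a crude hint that the process is paging and nothing more.

```python
import re


def parse_vm_fields(status_text):
    """Return the Vm* fields of a /proc/<pid>/status dump as {name: kB}."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(Vm\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def exceeds_ram(fields, ram_kb):
    """Crude heuristic: a virtual size above physical RAM suggests paging."""
    return fields.get("VmSize", 0) > ram_kb


def read_status(pid):
    """Read the raw status text for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()
```

For example, exceeds_ram(parse_vm_fields(read_status(pid)), 16 * 1024)
checks the indexer against a 16 MB machine, where pid is the process id
of the running swish-e.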
A workaround: searching does not need nearly as much memory, so
you can index your data on a better machine and then move the
index file to your old 486. With swish-e 2.0, index files should be
portable (if not, you have found a bug).
Another performance issue:
The IgnoreLimit option can make indexing slower, especially
with 2.0, because it has to recompute all the word positions and
counters at the end of the indexing run. Use IgnoreWords instead.
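
As a rough sketch of that change in the configuration file (the
IgnoreLimit numbers are only example values, and you should check the
exact directive syntax against the documentation for your version):

```
# Slower with 2.0: stopwords are only found after everything is indexed,
# so positions and counters must be recomputed at the end.
# IgnoreLimit 80 256

# Faster: the stopwords are known up front and skipped while indexing.
IgnoreWords SwishDefault
```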
Received on Thu Aug 3 06:01:35 2000