Re: Unexpected index file size reduction

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Sep 26 2002 - 18:42:49 GMT
At 09:50 AM 09/26/02 -0700, Lauren Landsburg wrote:
>I've been using swish-e for a number of years to index a site that has
>grown to close to over 4000 pages.
>
>Suddenly, with the latest site additions a week or so ago, instead of
>increasing a few meg, the index file size dropped from 35.9 meg to 25.5 meg.

Well, I hope that doesn't mean that anything is left out!

Which index file?  index.swish-e or index.swish-e.prop?  If the .prop file
is the one that got smaller, that may be due to zlib compression.
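
To give a feel for how much zlib can shrink repetitive property-style text,
here is a small Python sketch.  It does not reproduce swish-e's actual
on-disk format; the sample record and the compressed_ratio() helper are made
up purely for illustration.

```python
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size for `data`."""
    if not data:
        raise ValueError("data must not be empty")
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical property-like records: many near-identical lines,
# which is the kind of input zlib compresses very well.
sample = b"title: Example page\npath: /docs/page.html\n" * 500
packed = zlib.compress(sample)

print(f"original:   {len(sample)} bytes")
print(f"compressed: {len(packed)} bytes")
print(f"ratio:      {compressed_ratio(sample):.3f}")
```

Real property data is less repetitive than this sample, so the saving would
be smaller, but a drop of several meg in the .prop file would not be
surprising.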

Jose is constantly working to compress the index file, so although I can't
remember a specific change, it's possible you are seeing the results of his
efforts.


>The index files produced by swish-e had previously regularly increased in
>size with my additions to the website.  It takes about 2 hours to index the
>site using the http method.

Two hours seems like a long time to fetch 4000 files.  I suppose you have a
delay to keep from hitting your server too hard.

If you use spider.pl with its keep_alive feature then you should be able
to spider much faster without much load on the server (depending on your
available bandwidth, of course).
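
The reason keep-alive helps is that many pages are fetched over one
connection instead of opening a new one for every request.  The Python
sketch below shows the general idea only; it is not how spider.pl is
implemented, and fetch_all() and its arguments are hypothetical names chosen
for the example.

```python
import http.client


def fetch_all(host, port, paths, timeout=10):
    """Fetch each path over a single persistent HTTP/1.1 connection.

    Returns a dict mapping path -> (status code, body length in bytes).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    results = {}
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            # The body must be read completely before the same
            # connection can be reused for the next request.
            body = resp.read()
            results[path] = (resp.status, len(body))
    finally:
        conn.close()
    return results
```

Calling something like fetch_all("www.example.com", 80, ["/", "/about.html"])
would request both pages over one connection, provided the server allows
persistent connections.  If the server closes the connection after a
response, http.client quietly opens a new one for the next request, so the
loop still works but loses the speed benefit.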


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Thu Sep 26 18:47:44 2002