Re: Incremental Indexing when spidering

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jun 07 2001 - 17:34:08 GMT
At 10:15 AM 06/07/01 -0700, Bill Moseley wrote:
>At 09:56 AM 06/07/01 -0700, Gavin Walker wrote:
>>So the question is "Can -S http and -N file be used together?".
>
>Nope.  The swishspider program doesn't add the last modified date to the
>index.  The plan was to fix that at some point... a long time ago.
>
>If you spider with -S prog then the last modification date does get added
>correctly and then you can use -N.  The spider.pl that's in the development
>version can be used for this.  

Correction:

Ah, well, if you use -S prog with the spider, you would not use -N.  You would
just adjust the spider to skip anything older than a given date.  (No
point sending swish content just to have swish not index it, of course...)

-N is really for the -S fs method.

Run perldoc spider.pl and search for "no_index".  It's a flag you can set when
validating the response from a fetched URL.
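
A minimal sketch of what such a spider config might look like, not a tested
setup.  The @servers array, the test_response callback and its
($uri, $server, $response) arguments are assumptions about spider.pl's
config format and may differ in your version, so check perldoc spider.pl.
The URL, email address, and seven-day cutoff are placeholders.

```perl
# Hypothetical spider.pl config file: skip documents older than a cutoff.

# Assumption: only index pages modified within the last 7 days.
my $cutoff = time - 7 * 24 * 60 * 60;

@servers = (
    {
        base_url => 'http://localhost/',    # placeholder start URL
        email    => 'admin@example.com',    # placeholder contact address

        # Assumed callback, called after a URL has been fetched.
        test_response => sub {
            my ( $uri, $server, $response ) = @_;

            # last_modified comes from HTTP::Headers: epoch seconds,
            # or undef when the server sent no Last-Modified header.
            my $modified = $response->last_modified;

            # Set the no_index flag so old content is not sent to swish.
            $server->{no_index}++
                if defined $modified && $modified < $cutoff;

            return 1;    # true: continue processing this URL
        },
    },
);

1;
```

Documents with no Last-Modified header are still indexed in this sketch,
since there is no date to compare against; flip the "defined" test if you
would rather skip them.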

Bill Moseley
mailto:moseley@hank.org
Received on Thu Jun 7 17:47:47 2001