Re: Error Message: Index file error: Could not open

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jun 10 2004 - 00:02:20 GMT
On Wed, Jun 09, 2004 at 02:15:56PM -0700, Peter Karman wrote:
> I believe a simple FileFilter config line will work, though it is slower 
> than the SWISH::Filter module (Bill, correct me on this):
> 
> FileFilter .pdf       pdftotext   "'%p' -"

That's only needed if you're not using spider.pl's default config.  The
default config in spider.pl automatically filters pdf files (if the xpdf
programs are found in the path).

By default I mean passing "default <url>" to spider.pl -- the "default"
tells the spider to use a built-in config.  Look at spider.pl in an
editor to see that config -- and how it uses SWISH::Filter.
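As a rough sketch of what passing "default <url>" looks like in practice, a swish-e config along these lines should run the spider with its built-in config. The URL is a placeholder, and this assumes spider.pl can be found from where swish-e is run:

```
# swish.conf (sketch; URL is a placeholder)
IndexDir spider.pl
SwishProgParameters default http://www.example.com/
```

Then index with something like "swish-e -S prog -c swish.conf".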

Otherwise, if you don't pass a parameter to spider.pl it will look for
SwishSpiderConfig.pl (IIRC).  The example SwishSpiderConfig.pl file also
has examples of how to use SWISH::Filter.

Basically, you define a content filter in spider.pl's config that passes
the content and the content-type to SWISH::Filter.
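Something like the following, as a minimal sketch of a SwishSpiderConfig.pl rather than the exact code shipped with spider.pl. The base_url and email values are placeholders, the filter_content sub name is my own choice, and the callback arguments and SWISH::Filter calls are as I remember them from the docs, so check them against your installed version:

```perl
# SwishSpiderConfig.pl (sketch; base_url and email are placeholders)
use SWISH::Filter;

my $filter = SWISH::Filter->new;

our @servers = (
    {
        base_url       => 'http://www.example.com/',
        email          => 'admin@example.com',
        filter_content => \&filter_content,
    },
);

# Called by the spider just before the content is handed to swish-e.
# Returning false skips the document.
sub filter_content {
    my ( $uri, $server, $response, $content_ref ) = @_;

    # Hand the content and its content-type to SWISH::Filter
    my $doc = $filter->convert(
        document     => $content_ref,
        name         => $response->base,
        content_type => $response->content_type,
    );

    return 1 unless $doc;         # nothing filtered, index as-is
    return   if $doc->is_binary;  # still binary, so skip it

    # Replace the fetched content with the filtered text
    $$content_ref = ${ $doc->fetch_doc };
    $server->{parser_type} = $doc->swish_parser_type;
    return 1;
}

1;
```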

That make sense?

-- 
Bill Moseley
moseley@hank.org
Received on Thu Jun 10 00:03:15 2004