Re: Error Message: Index file error: Could not open

From: Kaplan, Andrew H. <AHKAPLAN(at)not-real.PARTNERS.ORG>
Date: Thu Jun 10 2004 - 13:50:40 GMT

I tried adding the FileFilter line to the swish.conf file, and
unfortunately that only made things worse.

The problem I am seeing is that while the indexing appears to be working
fine, when I search for the files via swish.cgi, the only results listed
are the index.swish files. I have not made any changes to the
SWISH::Filter file, which I assume is
/usr/local/lib/swish-e/perl/SWISH/Filter.pm. What is going on, and what
do I need to do to correct this?

-----Original Message-----
From: swish-e@sunsite3.berkeley.edu
[mailto:swish-e@sunsite3.berkeley.edu]On Behalf Of Bill Moseley
Sent: Wednesday, June 09, 2004 8:02 PM
To: Multiple recipients of list
Subject: [SWISH-E] Re: Error Message: Index file error: Could not open


On Wed, Jun 09, 2004 at 02:15:56PM -0700, Peter Karman wrote:
> I believe a simple FileFilter config line will work, though it is slower 
> than the SWISH::Filter module (Bill, correct me on this):
> 
> FileFilter .pdf       pdftotext   "'%p' -"

Only if not using spider.pl's default config.  The default config in
spider.pl automatically filters pdf files (if xpdf programs are found in
the path).

By default I mean passing "default <url>" to spider.pl -- the "default"
tells the spider to use a built-in config.  Look at spider.pl in an
editor to see that config -- and how it uses SWISH::Filter.
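
As a rough sketch of what "passing default <url>" looks like in practice,
a swish.conf for the spider might contain something like the lines below.
The URL is a placeholder, and this assumes spider.pl is installed where
swish-e can find it; adjust both for your setup.

```
# Minimal sketch, not a complete config.
# Run with:  swish-e -S prog -c swish.conf

# Use spider.pl as the external program that feeds documents to swish-e
IndexDir spider.pl

# "default" selects spider.pl's built-in config; the URL is a placeholder
SwishProgParameters default http://www.example.com/

# No FileFilter line for .pdf here: per the note above, the default
# spider config already filters PDFs when the xpdf programs are in the path.
```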

Otherwise, if you don't pass a parameter to spider.pl it will look for
SwishSpiderConfig.pl (IIRC).  The example SwishSpiderConfig.pl file also
has examples of how to use SWISH::Filter.

Basically, you define a content filter in spider.pl that passes the
content and the content-type to SWISH::Filter.

That make sense?

-- 
Bill Moseley
moseley@hank.org
Received on Thu Jun 10 13:50:57 2004