Re: index pdf files with

From: Erik Lyons <ELyons(at)>
Date: Wed Jul 23 2003 - 14:55:47 GMT
Thanks Bill,

Run this way, appears to expect perl, so given the "f.conf" 
example (list of directives) it fails in a bountiful blossom of syntax errors.

>>> Bill Moseley <> 07/22/03 07:07PM >>>
On Tue, Jul 22, 2003 at 04:38:13PM -0700, Erik Lyons wrote:
> After several weeks of exclaiming joyful praise to the initial "S" in
> SWISH, I stumbled across the example quoted below. It runs and reports
> "PDF transformed:      2,009  (19.7/sec)", but no PDF files can be
> returned in any search results. As an added bonus, all documents
> that are in the search results appear as "(NULL)". Are these
> related, or do I have 2 different gleaming horizons of delight to
> explore?

Hard to say, but probably not hard to debug.

Edit the spider's config file to point to a single PDF file.  Then just
run the spider like: > test.html

and look at test.html and make sure it has a title and content.
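
If eyeballing test.html gets tedious, a few lines of Python can do the same check. This is only a rough sketch: it assumes test.html holds one converted document as HTML (possibly preceded by some header lines, which it ignores), and the names TitleAndTextChecker and check_output are made up for illustration, not part of swish-e or the spider.

```python
from html.parser import HTMLParser


class TitleAndTextChecker(HTMLParser):
    """Collects the <title> text and any text found inside <body>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._in_body = False
        self.title = ""
        self.body_text = ""

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this.
        if tag == "title":
            self._in_title = True
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_body:
            self.body_text += data


def check_output(text):
    """Return (title, has_content) for one converted document."""
    checker = TitleAndTextChecker()
    checker.feed(text)
    checker.close()
    return checker.title.strip(), bool(checker.body_text.strip())


if __name__ == "__main__":
    # Assumption: test.html is in the current directory.
    with open("test.html", encoding="utf-8", errors="replace") as fh:
        title, has_content = check_output(fh.read())
    print("title:", repr(title))
    print("has body text:", has_content)
```

An empty title or no body text here would suggest the problem is in the conversion step rather than in indexing.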

Then you can index that one PDF with:

   cat test.html | swish-e -c your_config -S prog -i stdin -T properties

The -T properties option will show you whether the title is being stored.

Bill Moseley 
Received on Wed Jul 23 14:57:07 2003