On Tue, Jul 22, 2003 at 04:38:13PM -0700, Erik Lyons wrote:
> After several weeks of exclaiming joyful praise to the initial "S" in
> SWISH, I stumbled across the example quoted below. It runs and reports
> "PDF transformed: 2,009 (19.7/sec)", but no PDF files can be
> returned in any search results. As an added bonus, all document titles
> that are in the search results appear as "(NULL)". Are these problems
> related, or do I have 2 different gleaming horizons of delight to
> explore?
Hard to say, but probably not hard to debug.
Edit the spider's config file so that it points to a single PDF file.
Then run the spider like this:

    spider.pl your_config_file.name > test.html

Look at test.html and make sure it has a title and content.

Then you can index that one PDF with:

    cat test.html | swish-e -c your_config -S prog -i stdin -T properties

The -T properties option will show you whether the title is being stored.
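
If checking test.html by eye gets tedious, the title check can be scripted. This is only a rough sketch: has_title is a made-up helper name (not part of swish-e or spider.pl), and it assumes the whole <title> element sits on one line of the spider output.

```shell
# Hypothetical helper: succeeds if the file contains a <title> element
# with at least one non-whitespace character in it.
# Assumption: the opening and closing title tags are on the same line.
has_title() {
    grep -Eiq '<title>[^<]*[^<[:space:]][^<]*</title>' "$1"
}

# Only run the check if the spider output from the step above exists.
if [ -f test.html ]; then
    if has_title test.html; then
        echo "title found"
    else
        echo "no usable title in test.html"
    fi
fi
```

If this reports no usable title, the problem is on the spider/filter side rather than in the indexing step.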
--
Bill Moseley
moseley@hank.org
Received on Wed Jul 23 02:08:55 2003