Re: [swish-e] PDF indexing won't work

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Sat Sep 05 2009 - 01:34:46 GMT
Christoph Lechner wrote on 9/4/09 5:55 PM:
> Hi all!
> 
> [swish-e version 2.4.5 (Debian 5.0 stable)]
> 
> I'm new to swish-e and trying to index PDF files.
> swish_filter.pl drops some error messages that I don't understand. The
> error messages are the same for any PDF file it tries to index.
> 
> spider@web-int:~/swish-e$ swish-e -S prog -c swish.conf
> Indexing Data Source: "External-Program"
> Indexing "spider.pl"
> External Program found: /usr/lib/swish-e/spider.pl
> /usr/lib/swish-e/spider.pl: Reading parameters from 'default'
> Processing http://kb/kb/tb/...
> Processing http://kb/kb/tb/ATmega16.pdf...
> Failed to set content type for document './swtmpfltraEpRfM'
> Can't return outside a subroutine at
> /usr/share/doc/swish-e/examples/filter-bin/swish_filter.pl line 55.
> 
> Warning: filter
> '/usr/share/doc/swish-e/examples/filter-bin/swish_filter.pl' exited with
> non-zero status: [255]
> 
> If I wget one of the PDF files and run the test program, the contents of
> the PDF are dumped to the console without any error messages showing up:
> 
> /usr/share/doc/swish-e/examples/swish-filter-test --content --verbose
> d002X02.pdf
> 
> What's wrong?

Please send along your swish.conf and spider.pl config files.

I'm suspicious that spider.pl is calling swish_filter.pl at all. That seems
wrong. spider.pl should be using SWISH::Filter internally, not delegating to
swish_filter.pl.
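
For comparison, here is a rough sketch of a swish.conf that lets the spider do
its own filtering. It is an assumption about your setup, not a known fix: the
URL is taken from your output above, and as far as I recall the "default"
parameter makes spider.pl use its built-in settings and try SWISH::Filter
itself when that module is installed.

```
# Sketch only: run spider.pl as the external program for -S prog
IndexDir spider.pl

# "default" selects spider.pl's built-in configuration;
# the URL is assumed from the "Processing ..." lines in your output
SwishProgParameters default http://kb/kb/tb/

# Assumption: no FileFilter line is needed here. If your swish.conf has one
# pointing at swish_filter.pl, that may be what is invoking the script.
```

You would run it the same way as before: swish-e -S prog -c swish.conf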

Also, you might try upgrading to the latest version (2.4.7). 2.4.5 is now a few
years old.


-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Fri Sep 4 21:34:47 2009