Indexing other document types with SWISH::Filter

From: andy rosbrook <andy_rosbrook(at)>
Date: Sun Aug 20 2006 - 20:28:10 GMT
Hi all, just a quick question. I've been reading the docs on SWISH::Filter, and I tried the following command to test indexing a PDF document:

swish-filter-test foo.pdf foo.txt

I get the following result:

Document foo.pdf was  filtered.
   Document:     foo.pdf  (foo.pdf)
   Content-Type: text/html
   Parser type:  HTML*

   >Filter used: SWISH::Filters::Pdf2HTML=HASH(0x9dd70f0) ( application/pdf -> text/html )
** /usr/local/bin/swish-filter-test:
  Failed to open 'foo.txt': No such file or directory

What's the problem here? I presume the document was filtered OK?

On another note, does anything need to be included in the spider config to get SWISH::Filter working for PDF documents, or is it automatic?

Received on Sun Aug 20 13:28:16 2006