Hi,
I am using swish-e 2.4.2 and I have a problem with PDF files.
After launching $ ./swish-e -Sprog -c swish.conf, the following errors appear in the
output, and the crawler carries on:
...
Error: Couldn't find cidToUnicode file for the 'Adobe-WinCharSetFFFF' collection
Error: Unknown character collection 'Adobe-WinCharSetFFFF'
Error: Unknown font tag 'R137'
Error: May not be a PDF file (continuing anyway)
Error (0): PDF file is damaged - attempting to reconstruct xref table...
Error: Couldn't find trailer dictionary
Error: Couldn't read xref table
http://www.di.unipi.it/sindacati/21set2004.pdf - Using HTML2 parser - (no
words indexed)
...
I use pdftotext to filter the PDF files.
The relevant configuration in swish.conf is:
FileFilter .pdf pdftotext " '%p' -"
IndexContents TXT2 .pdf
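Since one of the messages says "May not be a PDF file", it may be worth checking the file itself, independently of swish-e. The following is only a minimal sketch: looks_like_pdf is a hypothetical helper name, and it assumes the document has already been saved to a local file and that head supports the -c option. It tests strictly for the "%PDF-" marker at the very start of the file, which is stricter than some PDF readers are.

```shell
# Hypothetical helper (not part of swish-e or xpdf):
# succeeds only if the file begins with the "%PDF-" header bytes.
looks_like_pdf() {
    # Read the first 5 bytes; a missing or unreadable file yields an empty string.
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

For example, with the URL above saved locally as 21set2004.pdf, running looks_like_pdf 21set2004.pdf && pdftotext 21set2004.pdf - would invoke the filter by hand only when the header is present, which might show whether the errors come from the file or from the indexing setup.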
What is the problem?
Thanks in advance.
Cheers, Andrea
Received on Tue Dec 14 02:52:15 2004