I was suggesting that the -v3 option would tell you whether swish-e was in
fact parsing swish_test.pdf or was somehow being passed something
different. I just tried your example here and it worked for me, so the
suggestion was meant as a way for you to start debugging what's going on.
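
One quick check that does not depend on swish-e at all is to look at the
raw bytes of the file. The sketch below is only an illustration, written in
Python; the function name looks_like_pdf and the 1024-byte window are my own
assumptions, not anything taken from swish-e or SWISH::Filter. It tests
whether the "%PDF-" marker appears near the start of the file, which is
roughly what PDF tools look for before they complain that something may not
be a PDF file.

```python
def looks_like_pdf(path, window=1024):
    """Return True if the '%PDF-' marker appears in the first `window` bytes.

    Both the helper name and the default window size are illustrative
    assumptions; adjust them to taste.
    """
    # Read in binary mode so no text decoding touches the data.
    with open(path, "rb") as fh:
        head = fh.read(window)
    return b"%PDF-" in head
```

Calling looks_like_pdf("swish_test.pdf") in the directory from the example
should return True for a healthy file. If it does, the file on disk is
probably fine and the data is more likely being altered or misnamed
somewhere between the external program and the filter.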
Gertjan Hofman scribbled on 6/30/06 3:59 PM:
>
> Peter -
>
> Not sure I understand - I am passing only 1 file -
> swish_test.pdf (as indicated in the config file I
> enclosed). Of course I started with entire folders,
> but to demonstrate the problem I only parse
> the one file.
>
> I note there are older messages in the mailing list
> with similar sounding problems - in that case
> spider.pl failed from a config file but worked in a
> pipe...
>
> Thanks
>
> Gertjan
>
>
> --- Peter Karman <peter@peknet.com> wrote:
>
>>
>> Gertjan Hofman scribbled on 6/29/06 11:59 PM:
>>
>>> TRY 1: USING CONFIG FILE
>>>
>>> gertjan-laptop:~/tmp/swish_test> swish-e -S prog -c swish_file.conf
>>> Indexing Data Source: "External-Program"
>>> Indexing "./DirTree.pl"
>>> External Program found: ./DirTree.pl
>>> Error: May not be a PDF file (continuing anyway)
>>> Error (0): PDF file is damaged - attempting to
>>> reconstruct xref table...
>>> Error: Couldn't find trailer dictionary
>>> Error: Couldn't read xref table
>>> Removing very common words...
>>> no words removed.
>>> Writing main index...
>>> err: No unique words indexed!
>>>
>>
>> Add the -v3 option to get more verbose output. That should
>> tell you the name of the file being parsed with
>> SWISH::Filter (xpdf). I'm betting the file isn't getting
>> passed correctly.
>>
>> --
>> Peter Karman . http://peknet.com/ . peter@peknet.com
>>
>
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Fri Jun 30 14:30:45 2006