I tried what you suggested and now I am getting some of these:
22865 Warning - /home/shared/Accounting/Capital/Update Capital 7-7-04.xls:
Character in 'c' format wrapped in pack at
/usr/lib/perl5/vendor_perl/5.8.6/Spreadsheet/ParseExcel.pm line 1790.
Error: Bad annotation action
Failed to set content type for document
Examiner Playground Article 12-13-02.mht'
Bad BBD entry!
Broken OLE file. Try using -b switch
Failed to set content type for
Do those matter?
Also, does the default SWISH::Filter install know about PowerPoint files
too? I looked in /usr/lib/swish-e/perl/SWISH/Filters but I only see files
that seem to reference MS Word, MS Excel, PDF, and MP3. I see that MS
PowerPoint is advertised on your web page as being supported, but there
doesn't seem to be much mention of it.
> Nick scribbled on 5/6/05 3:49 PM:
>> swish-e -c /etc/swish.conf -S prog -i DirTree.pl
>> I tried that but I got this:
>> Indexing Data Source: "External-Program"
>> Indexing "DirTree.pl"
>> External Program found: /usr/lib/swish-e/DirTree.pl
>> Must supply at least one directory
>> DirTree.pl [options] directory <directory...> | swish-e -S prog -i
>> -verbose Display processing info
>> -debug Enable debugging (including SWISH::Filter
>> -man Display documentation
>> -path Display location lib path set at installation
>> -no_skip Process documents even if filtering fails
>> -symlinks Follow symbolic links. Default is to NOT follow
>> Removing very common words...
>> no words removed.
>> Writing main index...
>> err: No unique words indexed!
> try adding this line to your existing config:
> SwishProgParameters /home/shared
> and comment out this line:
> # IndexDir "/home/shared"
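For anyone who wants to see what the -S prog side is doing rather than treat it as a black box, here is a minimal shell sketch of a directory walker. It is not DirTree.pl itself: the function names emit_doc and emit_tree are made up for illustration, and it does no filtering at all. It only shows the general shape of a program that needs at least one directory argument and writes each file as header lines (Path-Name and Content-Length), a blank line, and then the file body.

```shell
# emit_doc FILE: write one document to stdout as header lines,
# a blank line, then the raw file contents.
emit_doc() {
    file=$1
    # Content-Length is the exact byte count of the body that follows.
    size=$(wc -c < "$file" | tr -d '[:space:]')
    printf 'Path-Name: %s\n' "$file"
    printf 'Content-Length: %s\n' "$size"
    printf '\n'
    cat "$file"
}

# emit_tree DIR...: walk each directory and emit every regular file found.
# Like the usage message quoted above, it refuses to run without a directory.
emit_tree() {
    if [ "$#" -eq 0 ]; then
        echo "Must supply at least one directory" >&2
        return 1
    fi
    for dir in "$@"; do
        find "$dir" -type f | while IFS= read -r f; do
            emit_doc "$f"
        done
    done
}
```

Assuming the functions are loaded in the current shell, something like `emit_tree /home/shared | swish-e -c /etc/swish.conf -S prog -i stdin` would feed the output to the indexer. Binary formats such as .xls or .doc would still need converting to text first, which is the part SWISH::Filter handles.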
>> Is there any reason to use SWISH::Filter for performance, or is it just
>> supposed to be easier? To me doing something like this in the config
>> makes more sense, as I understand what it is doing when I tell it about
>> each type of file:
> I think you're right, in principle. You must be a sysadmin-type: we tend
> not to like the black box approach. ;)
> SWISH::Filter lets you drop in new filters and, in theory, not change your
> config. But doing it longhand like you have it should work too. Unless it
>> IndexContents TXT* .txt
>> IndexContents HTML* .htm
>> IndexContents HTML* .html
>> FileFilter .pdf pdftotext "'%p' -"
>> IndexContents TXT* .pdf
>> FileFilter .doc catdoc
>> IndexContents TXT* .doc
>> FileFilter .ppt ppthtml
>> IndexContents TXT* .ppt
>> But of course I have something wrong in there since I am getting lots of
>> errors from catdoc, and also I don't know how to put the Excel one in
>> there since I think it is a Perl script.
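One way the Excel case might be wired in by hand, following the same pattern as the pdftotext line above, is to point FileFilter at an executable script that takes the file path and prints plain text to stdout. This is only a sketch: xls2txt.pl and its location are hypothetical names used for illustration, not something known to ship with swish-e.

```
# xls2txt.pl is a hypothetical wrapper script that prints the
# spreadsheet's text to stdout; '%p' is the path of the file to read.
FileFilter .xls /usr/local/bin/xls2txt.pl "'%p'"
IndexContents TXT* .xls
```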
> Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Fri May 6 14:13:45 2005