Re: Indexing .doc .ppt .xls with filters and prog method

From: Benoit Guguin <liste(at)not-real.alixen.fr>
Date: Fri Aug 19 2005 - 11:53:49 GMT
OK, thank you.

I have tested with DirTree.pl and it works fine with xls, pdf and doc.

So I'm currently looking to add filters for PowerPoint and OpenOffice 
(sxi, sxw, sxc), but I don't understand the source code :( ...

If someone has already done this, could they share the file, please?


Thanks again,

Regards,

Peter Karman wrote:

>The .pm files:
>
>  doc2txt.pm
>  pdf2html.pm
>  pdf2xml.pm
>
>are example modules that predate (iirc) the SWISH::Filters class. The reason 
>pdf2html works in your script is this line in the pdf2html.pm file:
>
>   @EXPORT = qw(pdf2html);
>
>which tells Perl to make that function available in your script's namespace with 
>the 'use' function.
>
>I'd suggest using the DirTree.pl example script instead; it calls SWISH::Filter 
>for you correctly.
>
>Benoit Guguin scribbled on 8/19/05 4:45 AM:
>
>>Hello,
>>
>>I am trying to index a directory containing only pdf, doc, xls and ppt files.
>>
>>
>>I've seen that version 2.5.4 includes some Perl scripts to filter .ppt, .xls and .doc. 
>>
>>I tried to use them with the prog method, but when I run swish-e 
>>("swish-e -c /etc/swish-e/swish.conf -S prog") I get these errors:
>>
>>Undefined subroutine &main::Doc2html called at /etc/swish-e/swish.pl 
>>line 55.
>>Or
>>Undefined subroutine &main::pp2hml called at /etc/swish-e/swish.pl
>>
>>Which error I get depends on the order of the functions.
>>
>>
>>So I don't understand why it works fine for pdf but not for the other 
>>formats...
>>
>>I've been looking through the mailing list archive but haven't found my Holy Grail ;)
>>
>>Any ideas, please?
>>
>>Regards,

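For anyone reading this thread later, here is a minimal sketch of the export mechanism Peter describes. It is only an illustration: the package My::Filter and the functions to_upper and to_lower are hypothetical stand-ins and are not part of the swish-e distribution. The module is defined inline so the example runs as a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a module such as pdf2html.pm, defined inline
# so that the example is a single file.
{
    package My::Filter;
    use Exporter 'import';         # borrow Exporter's import() method
    our @EXPORT = qw(to_upper);    # names copied into the caller by default

    sub to_upper { return uc $_[0] }    # listed in @EXPORT
    sub to_lower { return lc $_[0] }    # not listed in @EXPORT
}

# For a module stored in its own .pm file, "use My::Filter;" loads the file
# and then makes this call. The module is inline here, so call it directly.
My::Filter->import;

# Exported name: callable without a package prefix.
my $upper = to_upper("doc");
print "$upper\n";

# Unexported name: still reachable with its full package name.
my $lower = My::Filter::to_lower("DOC");
print "$lower\n";

# Unexported name without the prefix: Perl looks in main:: and dies with
# "Undefined subroutine &main::to_lower called ...".
my $ok = eval { to_lower("DOC"); 1 };
print "error: $@" unless $ok;
```

If the other example modules do not list their function in @EXPORT, or the script calls a name spelled differently from the one exported, the unqualified call would fail in the same way as the last case. That is an assumption about the cause, not something checked against the 2.5.4 sources.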

-- 
Guguin Benoit
Société Alixen 2 rue Jean Rostand 91 893 Orsay Cedex France
Tel : 01 69 85 24 13, Fax : 01 69 85 24 10
Received on Fri Aug 19 04:53:51 2005