Re: Filter Word Files with -S fs indexing

From: <moseley(at)>
Date: Tue Aug 26 2003 - 15:24:11 GMT
On Tue, Aug 26, 2003 at 06:54:03AM -0700, Bucharow Leonard wrote:
> Hi Bill,
> sorry for the newbie questions, I hope you can help me though:
> I'm trying now indexing file system, cause the is not greatly
> suitable for search in Intranet with Java-Plugin's, JavaScript and PHP
> dynamic sites.

You can't process JavaScript without a JavaScript interpreter, but for PHP
I'd think you would want to spider instead of using the file system.

> Parsing PDF files works fine, except for a few PDF files that fail with the
> error "Bad annotation destination" or "Bad annotation action". I've read
> that this comes from xpdf (pdfinfo or pdftotext). Unfortunately, the xpdf
> help is not extensive. Do you know what it means and what is wrong with
> those PDF files?

No, I don't, and Google isn't much help.  I just sent mail to the xpdf
author, but you might also try asking on a group like comp.text.pdf.

> The second question:
> How can I filter MS Word files with -S fs indexing (if you have a solution
> for PowerPoint and Excel, it would be great)?

Here are some options:

1) use with SWISH::Filter

2) if you have a good reason not to spider (like files are not 
available on a web server) use the prog-bin/ example program 
and copy in the code from to use SWISH::Filter

3) try the filters listed at
and use a FileFilter directive.  I have not tried those filters.
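
As a rough sketch of option 3, a FileFilter entry in the config file might
look something like the following. This assumes a Word-to-text converter such
as catdoc is installed and on your PATH; catdoc is only an example I'm
picking for illustration, not one of the filters referred to above, and I
have not tested this setup. IndexContents with the TXT* parser assumes a
swish-e version that supports it.

```
# Sketch only: catdoc is an assumed converter, not something tested here.
# Run every .doc file through catdoc; %p is replaced with the path of the
# file being indexed, and the filter's stdout is what gets indexed.
FileFilter .doc catdoc "'%p'"

# Parse the filtered .doc output as plain text.
IndexContents TXT* .doc
```

You would then index as usual with -S fs and point -c at that config file.
The same pattern should apply to PowerPoint and Excel files if you can find
a command-line converter that writes text to stdout.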

Google might find other solutions.

Bill Moseley
Received on Tue Aug 26 15:25:51 2003