Re: Indexing of word documents, stored on a UNIX

From: FISHER,JOSEPH (Non-HP-Roseville,ex1) <joseph_fisher(at)>
Date: Fri Aug 17 2001 - 21:57:18 GMT

Hi Bill,

Ok, I understand that I need to include a filter file in order to index the
contents of MS Word documents stored on a Unix system... (As I understand
it, this was NOT necessary under SWISH 1.3...)

I've downloaded and compiled "catdoc"... Catdoc is even referenced in one of
the filter files under SWISH-E 2.1...


I've installed it in its default location, and made sure that the filter
file is pointing to the correct directory structure...

But which configuration file should I modify to make SWISH-E see this MS
Word filter file?

Thanks in advance,

Joe Fisher

-----Original Message-----
From: Bill Moseley
Sent: Friday, August 17, 2001 12:04
To: Multiple recipients of list
Subject: [SWISH-E] Re: Indexing of word documents, stored on a UNIX

At 11:31 AM 08/17/01 -0700, FISHER,JOSEPH (Non-HP-Roseville,ex1) wrote:
>When I index the documents, everything appears to go through just fine,
>with the following exceptions:
>	1) I get a warning message for each file being indexed:
>		Warning: Possible embedded null in file

Well, without seeing your config, I don't know.  To index Word documents you
need to use a filter (or add filtering to your program if indexing with -S

Don't use a shell or Perl script to call catdoc -- rather, call catdoc
directly as shown in the example.  The scripts will kill your indexing
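
For readers following the thread, here is a minimal sketch of what calling
catdoc directly from the SWISH-E config file might look like. It is not the
example referred to above. It assumes a SWISH-E release whose FileFilter
directive takes a suffix, a program path, and an optional quoted options
string in which %p stands for the path of the file being indexed. It also
assumes catdoc was installed as /usr/local/bin/catdoc. Both the path and the
options string are illustrative, so check the SWISH-CONFIG documentation
shipped with your version before relying on it.

```
# Hypothetical fragment of the config file given to swish-e with -c.
# Assumption: catdoc is installed in /usr/local/bin; adjust as needed.
# The filter program is named directly, with no shell or Perl wrapper.
FileFilter .doc /usr/local/bin/catdoc "%p"
```

Naming the program directly means no wrapper script has to be started for
every Word document that gets indexed.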

Bill Moseley
Received on Fri Aug 17 22:22:55 2001