Config problems using catdoc on linux/debian

From: Thomas Nyman <thomas(at)not-real.teg.pp.se>
Date: Tue Apr 12 2005 - 11:00:07 GMT
Hi

I have installed swish-e 2.4.3.

I wish to index primarily MS-Word documents.

I have the following config file:

IndexDir /usr/local/arkiv/
IndexOnly .doc .txt
IndexContents TXT .txt .doc
StoreDescription HTML <body> 200000
StoreDescription TXT 256
MetaNames swishdocpath swishtitle
FileFilter .doc /usr/bin/catdoc "-s8859-1 -d8859-1'%p'"
ReplaceRules remove /usr/local/arkiv/
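
One detail in the FileFilter line above that may be worth checking: there is no space between -d8859-1 and '%p'. The Python sketch below is only an illustration. It assumes that swish-e replaces %p with the file path and that the resulting options string is then split the way a POSIX shell would split it. The path example.doc is hypothetical.

```python
import shlex

# Options string copied from the FileFilter line in the config above.
options_as_written = "-s8859-1 -d8859-1'%p'"

# Same options, but with a space before '%p' (a suggested variant, not from the original post).
options_with_space = "-s8859-1 -d8859-1 '%p'"

# Hypothetical document path, used only for illustration.
doc_path = "/usr/local/arkiv/example.doc"


def expand(template, path):
    """Substitute %p with the path, then split the result using POSIX shell rules."""
    return shlex.split(template.replace("%p", path))


as_written = expand(options_as_written, doc_path)
with_space = expand(options_with_space, doc_path)

print(as_written)   # arguments produced by the line as written
print(with_space)   # arguments produced when a space precedes '%p'
```

If that assumption holds, the line as written would pass the path glued onto the -d option instead of as a separate file argument, while the variant with a space keeps the two apart.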

My problem is that I can only search on the document path. I am not getting
any hits on the content of the Word files. I receive no error messages;
indexing ends with:

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 33 words alphabetically
Writing header ...
Writing index entries ...
    Writing word text: Complete
    Writing word hash: Complete
    Writing word data: Complete
33 unique words indexed.
5 properties sorted.
8 files indexed.  680 total bytes.  127 total words.

Why am I unable to search on the contents of the files? What am I
missing?

I tried switching to SWISH::Filter but kept getting an error saying it
was not loaded, and I was unable to figure out how to "load"
SWISH::Filter.

Thanks
Received on Tue Apr 12 04:00:09 2005