Hello,
I had Swish-e running just fine, indexing only htm and pdf files on an
intranet. When I tried adding pdf and doc indexing capabilities, the
indexing task (which I run once a week) failed. The config file it uses
is as follows. New lines are preceded by an asterisk, which I added for
this email; the asterisks are not part of the actual file:
# The directory to index, excluding dir manlibtp and readme files
IndexDir /Data/webroot/docs
FileRules dirname contains manlibtp
*FileRules filename contains readme
#To enable indexing of doc and pdf files
*FileFilter .doc /Program Files/SWISH-E/lib/swish-e/catdoc.exe '-s8859-1 -d8859-1 "%p"'
*FileFilter .pdf /Program Files/SWISH-E/lib/swish-e/pdftotext.exe "%p -"
# Don't want to index .txt as they are mostly readme files
*IndexOnly .htm .html .pdf .doc
# How to process
IndexContents HTML .html .htm
*IndexContents TXT2 .pdf .txt .doc
# Allow searching by title, path
MetaNames swishtitle swishdocpath
# To output body text and enable highlighting
StoreDescription HTML* <content> 256
*StoreDescription TXT 256
# Replaces actual path with URL
ReplaceRules replace /Data/webroot/docs http://url
Sarah
Received on Mon Dec 20 10:08:54 2004