With the help of this discussion group, Swish-e is working beautifully on
our intranet. I am now attempting to get the pdf files indexed, and after
reading the documentation and previous pdf discussions, I am thoroughly
confused about how to go about it.
I am using a Windows 2000 machine, and I think I need to use xpdf's pdftotext
filter. We have not installed Perl on our machine.
My swish-e config file is:
# Configuration file for LowCoNet Intranet Procedures
# This is the name of the index file
#Index the files in this folder
IndexDir "d:/LowCoNet Intranet Files/Procedures"
#Remove this part of the path. It will be replaced with
#the URL by the php interface config file
ReplaceRules remove "d:/LowCoNet Intranet Files/"
#Only index files ending in .htm .html .pdf .txt
IndexOnly .htm .html .pdf .txt
IndexContents TXT2 .pdf .txt .doc
MetaNames swishdocpath swishtitle
PropertyNames description author keywords
#Don't Index files with ~
FileRules pathname contains ~
#Assign the pdftotext filter to .pdf files
FileFilter .pdf c:/xpdf/pdftotext.exe '"%p"-'
I have installed xpdf and edited the sample.xpdfrc.
I run the following from my command line:
swish-e -c procedures.cfg -s prog
I get a good index of the html files, but for each pdf file swish finds, I
get an "error:couldn't open file myfilename.pdf" message. As I am not in any
way a programmer, I am getting more lost the more I try to troubleshoot this
issue. Can anyone set me on a better path?
Received on Mon Jun 9 14:42:32 2003