good morning everybody
i've been successfully running swish-e on a windows machine for several
years now. a new customer now insists on having pdf files indexed. ok,
i went at it, installed the newest 2.4.7 (with all necessary plugins) and
tried, but it does not work. here is how i do it:
- the config file is attached as www.allpedes.ch.conf
- the spider is run with the attached allpedes.bat (the subsequent
merge step with swish-e.exe is not included yet because of the
initial problem described below). the file has been renamed to
allpedes.txt because of gmail's antispam filtering.
- the resulting www.allpedes.ch_spider.txt can be viewed here:
interestingly, NO pdf files are indexed / dumped into the txt file,
although they are there on the web site
(http://www.allpedes.ch/de_kataloge.cfm?kid=all for example), and the
<a href>'s are easily found in the txt file when searching for '.pdf'.
what am i missing or doing wrong?
conf files in that format have worked for me for years; the only change
i made this time was to remove the |pdf from the filter list on line 6.
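to make clear what i mean by removing |pdf, here is a small python sketch of how i understand an extension filter of that kind to behave. the two patterns and the example.com urls are made up for illustration; the real filter is on line 6 of the attached conf and may look different.

```python
import re

# hypothetical stand-ins for the filter on line 6 of the conf file;
# the real pattern is in the attachment and may differ
OLD_SKIP = re.compile(r"\.(gif|jpe?g|png|pdf)$", re.IGNORECASE)  # before the change
NEW_SKIP = re.compile(r"\.(gif|jpe?g|png)$", re.IGNORECASE)      # after removing |pdf


def is_skipped(url, pattern):
    """Return True if the extension filter would reject this url."""
    # drop fragment and query string so only the path is tested
    path = url.split("#", 1)[0].split("?", 1)[0]
    return pattern.search(path) is not None


if __name__ == "__main__":
    # made-up urls, only to show the difference between the two patterns
    for url in ("http://www.example.com/docs/catalog.pdf",
                "http://www.example.com/img/logo.png"):
        print(url,
              "old:", "skip" if is_skipped(url, OLD_SKIP) else "keep",
              "new:", "skip" if is_skipped(url, NEW_SKIP) else "keep")
```

if the filter really works like this, the pdf urls should no longer be rejected by it after the change, so something else seems to drop them.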
thanks for some help :)
nextron internet team GmbH
Reinacherstrasse 129, CH-4053 Basel
Tel: +41 61 695 92 25 / Fax: +41 61 695 92 21
Received on Thu May 27 03:27:31 2010