Rainer Scherg RTC wrote:
>I've made some enhancements to swish-e 1.1 to index Non-Text or HTML files
>(e.g. to get PDF-files indexed) [I've sent the code changes to Roy].
Could you describe the code changes? Do you directly index the PDF files?
To index PDF files, I implemented the following workaround:
1. For every PDF file (for example, "myfile.pdf"), create a file
"myfile.pdf.html" that contains the plain text to be indexed.
2. When the search engine returns a hit on myfile.pdf.html, change the
reference to point to myfile.pdf instead.
This also works for other file types, such as Word files. The only
disadvantage is that you must create and maintain a separate HTML file
for every document.
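
The two steps above could be sketched in Python roughly as follows. This
is only an illustration, not the code actually used: the names
write_companion, hit_to_original and the extract_text callback are
hypothetical, and how the plain text is pulled out of the PDF or Word
file is left entirely to the caller.

```python
import html
from pathlib import Path
from typing import Callable

# Suffix appended to the original file name to form the companion file
# (from the post: "myfile.pdf" -> "myfile.pdf.html").
COMPANION_SUFFIX = ".html"


def write_companion(source: Path, extract_text: Callable[[Path], str]) -> Path:
    """Step 1: write "<source>.html" holding the plain text of `source`.

    `extract_text` is a hypothetical caller-supplied function; the post
    does not say how the text is obtained from the original file.
    """
    companion = source.with_name(source.name + COMPANION_SUFFIX)
    # Escape the text so "<" and "&" are not read as markup by the indexer.
    text = html.escape(extract_text(source))
    companion.write_text(
        f"<html><head><title>{html.escape(source.name)}</title></head>\n"
        f"<body><pre>{text}</pre></body></html>\n",
        encoding="utf-8",
    )
    return companion


def hit_to_original(hit: str, extensions=(".pdf", ".doc")) -> str:
    """Step 2: map a hit on "myfile.pdf.html" back to "myfile.pdf".

    Ordinary HTML pages such as "index.html" are returned unchanged.
    The default extension list is an assumption for this sketch.
    """
    if hit.endswith(COMPANION_SUFFIX):
        original = hit[: -len(COMPANION_SUFFIX)]
        if original.lower().endswith(extensions):
            return original
    return hit
```

The companion files would be generated before each indexing run, and
hit_to_original would be applied to every path in the search results
before they are shown to the user.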
Patrick Fitzgerald, HP Internet and System Security Lab
firstname.lastname@example.org -or- email@example.com
(do *not* use firstname.lastname@example.org, that is not me)
Received on Mon Aug 10 10:27:21 1998