In August last year I posted a message to this mailing list saying
that I had made some enhancements which enable swish (1.1) to index
non-HTML files such as PDF or other document types (filter option).
Since then I have occasionally received requests asking how to do this
and where to get the source. Because of these requests I'm adapting the
small enhancements to swish-e 1.3.2.
If there is public interest, I will try to get a small amount of web space
to provide the source, instead of sending it via email on each request.
To describe the changes to swish in short:
New config directive:

    FileFilter <file-ext> <filterprog>

Examples:

    FileFilter .pdf pdf-filter.sh
    FileFilter .doc ms-wword-filter.sh
    FileFilter .ps ps-filter.sh
    FileFilter .gz gzip-filter.sh
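
In other words, each FileFilter line maps a file extension to a program
that writes plain text to stdout. The lookup can be pictured roughly as
the shell function below. This is only an illustration of the mapping,
not the code inside swish, and the function name filter_for is made up
for this example.

```shell
#!/bin/sh
# Hypothetical illustration only: map a file name to the filter program
# named in the FileFilter examples. Prints nothing if no filter applies.
filter_for() {
    case "$1" in
        *.pdf) echo "pdf-filter.sh" ;;
        *.doc) echo "ms-wword-filter.sh" ;;
        *.ps)  echo "ps-filter.sh" ;;
        *.gz)  echo "gzip-filter.sh" ;;
    esac
}
```

For example, `filter_for manual.pdf` prints pdf-filter.sh, while
`filter_for index.html` prints nothing, so the file would be indexed
without a filter.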
Example filter script (pdf-filter.sh):
#!/bin/sh
# Convert file in arg1 to txt on stdout
/usr/local/bin/pdftotext "$1" - 2>/dev/null
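
The other filters follow the same pattern: take the file name as arg1
and write text to stdout. A gzip filter along these lines might look
like the following. It is a minimal sketch, assuming gzip is in the
PATH, and it is wrapped in a function (the name gzip_filter is only for
illustration) so it can be tried out in a shell.

```shell
#!/bin/sh
# Sketch of a gzip filter: decompress the file in arg1 to stdout.
# Assumes gzip is in the PATH; errors are discarded as in pdf-filter.sh.
gzip_filter() {
    gzip -dc "$1" 2>/dev/null
}
```

As a standalone gzip-filter.sh, the body would simply be the gzip line
applied to "$1".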
Received on Fri May 7 10:37:38 1999