I'm interested in this patch too, because on an
intranet, or on the Internet, you can have a lot of documents
that aren't HTML or .txt. So I think it's a good
idea to put it on the WWW.
University of the Basque Country
> In August last year I wrote a message to this mailing list
> saying that I've made some enhancements which enable swish (1.1) to index
> non-HTML files such as PDF or other document types (filter option).
> Since then I have received occasional requests asking how to do this and
> where to get the source. Due to these requests I'm adapting the small
> enhancements to swish-e 1.3.2.
> If there is public interest, I will try to get a small webspace
> to provide the source, instead of sending it via email on each request.
> To describe the changes to swish in short:
> New config directives:
> FilterDir <path-to-filter-progs>
> FileFilter <file-ext> <filterprog>
> Example:
> FilterDir /usr/local/etc/httpd/sbin/filters
> FileFilter .pdf pdf-filter.sh
> FileFilter .doc ms-wword-filter.sh
> FileFilter .ps ps-filter.sh
> FileFilter .gz gzip-filter.sh
> e.g. the pdf-filter.sh script:
> #!/bin/sh
> # Convert file in arg1 to txt on stdout
> /usr/local/bin/pdftotext "$1" - 2>/dev/null
> Regards Rainer
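
For anyone who wants to picture how the two directives fit together before the source is available, here is a rough shell sketch. It is only my guess at the behaviour, not Rainer's actual patch (which lives inside swish itself): the function names filter_for and filtered_text are made up for illustration, and the extension table simply mirrors the FileFilter lines quoted above.

```shell
#!/bin/sh
# Hypothetical sketch of the FilterDir / FileFilter idea; NOT the real patch.
# FILTER_DIR plays the role of the FilterDir directive.
FILTER_DIR=${FILTER_DIR:-/usr/local/etc/httpd/sbin/filters}

# Print the filter script configured for a file name, or nothing at all
# if no FileFilter entry matches its extension.
filter_for() {
    case "$1" in
        *.pdf) echo pdf-filter.sh ;;
        *.doc) echo ms-wword-filter.sh ;;
        *.ps)  echo ps-filter.sh ;;
        *.gz)  echo gzip-filter.sh ;;
    esac
}

# Write the text to be indexed on stdout: the output of the matching
# filter when there is one, otherwise the file contents unchanged.
filtered_text() {
    f=$(filter_for "$1")
    if [ -n "$f" ]; then
        "$FILTER_DIR/$f" "$1"
    else
        cat -- "$1"
    fi
}
```

With something like this, filtered_text manual.pdf would run pdf-filter.sh on the file, while filtered_text index.html would just pass the file through untouched.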
Received on Mon May 10 01:55:16 1999