You can have a look at the CGI that I wrote as part of my
Knowledge Mapper project.
Download the code from
The code is in the subdirectory called ac_search_cgi. There
is code available to filter PDF/DOC/HTML files in the
subdirectory ac_filter_docs. It uses pdftotext, catdoc and
lynx for filtering.
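
For illustration only, the filtering step could be sketched
roughly as below. This is not the code from ac_filter_docs;
the names FILTERS, filter_command and to_text are
hypothetical, and the sketch assumes pdftotext, catdoc and
lynx are installed and on the PATH.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from file extension to an external filter command.
# "{}" marks where the input path is substituted.
FILTERS = {
    ".pdf": ["pdftotext", "{}", "-"],            # "-" writes the text to stdout
    ".doc": ["catdoc", "{}"],                    # catdoc writes to stdout by default
    ".html": ["lynx", "-dump", "-nolist", "{}"],
    ".htm": ["lynx", "-dump", "-nolist", "{}"],
}


def filter_command(path):
    """Return the argument list that converts `path` to plain text."""
    suffix = Path(path).suffix.lower()
    if suffix not in FILTERS:
        raise ValueError("no filter configured for %r" % suffix)
    return [str(path) if arg == "{}" else arg for arg in FILTERS[suffix]]


def to_text(path):
    """Run the external filter and return its output as a string."""
    result = subprocess.run(filter_command(path), capture_output=True, check=True)
    # The output encoding depends on the tool and locale; UTF-8 is assumed here.
    return result.stdout.decode("utf-8", errors="replace")
```

Calling to_text("report.pdf") would then run
"pdftotext report.pdf -" and return whatever text it prints.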
It relies on a text/html version of the file being
available in a specific directory, but it does highlight
the search terms.
Let me know how you get on.
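
In case it helps, here is a minimal sketch of the general
idea (highlight phrases in the text of an HTML document
while leaving the tags alone). It is not the Knowledge
Mapper CGI, and it is in Python rather than C or Perl. The
Highlighter class, the highlight function and the <b> style
used for the markup are all assumptions made for the
example.

```python
import re
from html.parser import HTMLParser


class Highlighter(HTMLParser):
    """Re-emit HTML, wrapping phrase matches found in text nodes."""

    def __init__(self, phrases):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        phrases = [p for p in phrases if p]
        if not phrases:
            raise ValueError("at least one non-empty phrase is required")
        # Longest phrases first so "search engine" wins over "search".
        phrases.sort(key=len, reverse=True)
        self.pattern = re.compile("|".join(map(re.escape, phrases)), re.IGNORECASE)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def _mark(self, match):
        # Assumed highlight markup; change to taste.
        return '<b style="background:#ff6">%s</b>' % match.group(0)

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text, verbatim

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)  # note: end tags come out lowercased

    def handle_data(self, data):
        if self.skip_depth:
            self.out.append(data)
        else:
            self.out.append(self.pattern.sub(self._mark, data))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def highlight(html_text, phrases):
    """Return html_text with each phrase highlighted outside of tags."""
    parser = Highlighter(phrases)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because only text nodes are passed through the regular
expression, a phrase that also appears inside a tag (for
example in a class attribute) is left as it is. A phrase
that is split by inline markup or by an entity will not be
matched by this sketch.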
On Mon, 15 Dec 2003 08:25:56 -0800 (PST)
Frank Naude <firstname.lastname@example.org> wrote:
> Does anybody have a script to highlight/colorize search
> phrases in HTML documents? HTML tags should be left intact.
> The script should be similar to Google's cached pages
> functionality.
> Any ideas on how to write such a script (in C or Perl)
> would also be highly appreciated.
> Best regards.
> Best regards.
Received on Thu Dec 18 07:39:14 2003