Re: Ignore question

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Feb 13 2003 - 15:13:02 GMT
On Thu, 13 Feb 2003, Gentile, Jeff wrote:

> I am using SWISH to search a knowledge base (read: text files) for my
> support department that has a cgi/perl front end...  all html is
> within the script.

Since this is your first post, you should know that I tend to ramble on
about both on- and off-topic things. So...

Think about moving to something like Apache/mod_perl with Mason (if you
think page-centric is the way to go) or with Template-Toolkit (which I feel
is more data / code driven).  Or even PHP, for quick development and much
faster processing than Perl/CGI.


> The main page is a image-mapped flow chart, each
> box leading to a "leaf" page pertaining to that (sub)category. Each
> leaf has various description fields that are associated with the
> category by filename.

It also sounds like you need a real database instead.


> Question:
> 
> I am trying to get SWISH to Ignore the header (first 5 lines) of the
> tech notes. However, even if there was a feature that was the reverse
> of "TruncateDocSize" to allow me to skip the first 5 lines, that
> wouldn't work, because of the "description" files that do not have
> this header and are associated by name.

Take a look at the prog-bin directory in the distribution.  What you do is
write a simple program that reads and parses your tech notes and only
passes to swish the data you want indexed.  prog-bin/DirTree.pl is a very
simple example.
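
As a rough illustration of that idea, here is a minimal sketch of such a
program.  It is written in Python rather than Perl purely for illustration,
and it is not taken from the distribution.  It assumes the usual "prog"
record layout (Path-Name, Content-Length and Last-Mtime headers, a blank
line, then the content), and the rule for telling tech notes from
description files (a ".desc" suffix) is a made-up placeholder; substitute
whatever naming scheme you really use.

```python
import os
import sys

HEADER_LINES = 5  # from the question: tech notes start with a 5-line header


def strip_header(text, header_lines=HEADER_LINES):
    """Return text without its first header_lines lines."""
    return "".join(text.splitlines(keepends=True)[header_lines:])


def is_tech_note(path):
    """Hypothetical rule: anything not ending in .desc is a tech note."""
    return not os.path.basename(path).endswith(".desc")


def format_record(path, body, mtime):
    """Build one document record: headers, a blank line, then the content."""
    data = body.encode("utf-8")
    headers = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(data)}\n"  # length in bytes, not characters
        f"Last-Mtime: {int(mtime)}\n"
        "\n"
    )
    return headers.encode("utf-8") + data


def main(root):
    """Walk root and write one record per file to standard output."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                text = fh.read()
            if is_tech_note(path):
                text = strip_header(text)  # description files pass through whole
            record = format_record(path, text, os.path.getmtime(path))
            sys.stdout.buffer.write(record)


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

You would then point swish-e at the script with the prog input method,
something along the lines of "swish-e -S prog -i ./notes_prog.py" (the
script name is hypothetical), with the directory to walk passed as the
script's argument; check the documentation for your version for the exact
options.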


Back to my off-topic comments: if you were using a templating system like
Template-Toolkit, which separates the code from the output, you would
already have (or maybe you do) a module that fetches and parses the notes
into a nice Perl data structure.  That module could be used by your
presentation layer (Template-Toolkit) to generate HTML, or text, or a form
for editing, and also by a small program that feeds the notes to swish-e
while indexing.
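
To make the shape of that concrete, here is a small sketch of one parsing
module shared by two consumers.  It is in Python rather than Perl, only as
an illustration; the names (Note, parse_note, render_html, index_text) and
the assumption that the first header line works as a title are all
hypothetical.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Note:
    header: list  # raw header lines (assumed: the first 5 lines of the file)
    body: str     # everything after the header


def parse_note(text, header_lines=5):
    """Split a tech note into its header lines and its body."""
    lines = text.splitlines()
    return Note(header=lines[:header_lines],
                body="\n".join(lines[header_lines:]))


def render_html(note):
    """Presentation consumer: show the first header line as a title."""
    title = escape(note.header[0]) if note.header else "Untitled"
    return f"<h1>{title}</h1>\n<pre>{escape(note.body)}</pre>"


def index_text(note):
    """Indexing consumer: hand over only the body, never the header."""
    return note.body
```

The point is only that the parsing lives in one place, so the web front end
and the indexing program cannot disagree about where the header ends.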



-- 
Bill Moseley moseley@hank.org
Received on Thu Feb 13 15:13:44 2003