Thanks for your response. This is exactly the answer that I was looking for...
DirTree.pl is almost exactly what I need: slightly modified to process only
my technote directory, it gives me an easy way to handle the file attachments that
I allow (PDFs included, nicely enough) and to exclude those I can't index, like data captures.
Funny enough, I spent some time before starting this project under its current architecture
looking at a db, mod_perl, Mason, templating, or even PHP. I didn't go these routes for a
couple of reasons... I didn't know Perl a month ago, and wanted to learn it for its other
uses besides cgi/php type applications. mod_perl and Mason seemed like more than I needed
to learn initially, and would slow my go-live date, although I may end up there in the future.
I didn't go with templating since it seemed easier for me to pick a style, develop, and
template later if I wanted to change the interface. The html is quite simple (1 table
format, and some forms) and only spiffed up by a CSS.
That all being said, I will likely be going to a database before any of these options.
Right now, however, believe it or not, the 107 technotes that apply to the new product
set this is for take up a whopping 270K...lol... So I guess I have some time
before I need to look at putting everything in the database, and handling the attachments
in some serialized fashion...
Next phase is to learn GraphViz to get my front-end flowchart dynamic so I can allow
folks to do child add/delete functions...
Again, thanks for your help. It looks like I'm off to the races now.
From: Bill Moseley [mailto:email@example.com]
Sent: Thursday, February 13, 2003 10:13
To: Multiple recipients of list
Subject: [SWISH-E] Re: Ignore question
On Thu, 13 Feb 2003, Gentile, Jeff wrote:
> I am using SWISH to search a knowledge base (read: text files) for my
> support department that has a cgi/perl front end... all html is
> within the script.
As your first post you should know that I ramble on about on and off-topic
Think about moving to something like apache/mod_perl + mason (if you think
page-centric is the way to go) or + Template-Toolkit (which I feel is more
data / code driven). Or even PHP for quick development and much faster
processing than perl/CGI.
> The main page is an image-mapped flow chart, each
> box leading to a "leaf" page pertaining to that (sub)category. Each
> leaf has various description fields that are associated with the
> category by filename.
Also sounds like you need a real database instead.
> I am trying to get SWISH to Ignore the header (first 5 lines) of the
> tech notes. However, even if there was a feature that was the reverse
> of "TruncateDocSize" to allow me to skip the first 5 lines, that
> wouldn't work, because of the "description" files that do not have
> this header and are associated by name.
Take a look at the prog-bin directory in the distribution. What you do is
write a simple program that reads and parses your tech notes and only
passes to swish the data you want indexed. prog-bin/DirTree.pl is a very
Back to my off-topic comments, if you were using a templating system like
Template-Toolkit that separates the code from the output you would already
have (or maybe you do) a module that fetches and parses the notes into a
nice perl data structure. That module could be used with your
presentation layer (using Template-Toolkit) to generate HTML, or text, or
a form for editing, or by a small program to use while indexing with
Bill Moseley firstname.lastname@example.org
Received on Thu Feb 13 19:43:50 2003