At 08:30 AM 07/01/02 -0700, Andrew Lord wrote:
>A general description of the script as a whole (when working correctly):
Instead of editing your script I wrote a new one. I think it works somewhat
similarly to the way yours does:
It parses a directory for swish config files and extracts out the index
file name and the "IndexName", a description of the index.
You can select (one or more) indexes from a list, and click on a letter and
see all the words starting with that letter in the associated index(es).
Those words would then link to a CGI script to list what files they are in.
One difference, I think, is that this script allows you to (optionally)
also select more than one index at the same time.
(this link will not be around very long)
http://hank.org:5000/search/letters.cgi.text for source:
I did not format the HTML much; that is left as an exercise. The word links
don't go anywhere -- they would normally go to a search CGI script. Yet another
exercise.
The current script you sent does not use -w or use strict, so I wanted to
create something a bit cleaner. I think this script is far from perfect
(or probably far from bug free???).
In the original script it looks like the word links included the index file
name. I just don't like passing around that kind of info in URLs --
specifically I worry that some other script will use the file names
directly, which can be a security problem. Instead I create unique index
IDs for each swish-e index that are passed around in URLs.
That means that more than one script (letters.cgi and the search script)
must be able to translate those IDs into real file names.
So, there's a chunk of code in the program (search for "package" in
letters.cgi) that should really live in a separate Perl module that
would be use'd by both the letters.cgi script and any associated CGI
script. That module provides a common way to parse the index config files,
and to convert the index IDs into index file names.
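To make the shared-module idea concrete, here is a minimal sketch. It is not the code from letters.cgi: the package name My::SwishIndexes, the function names, and the choice of "position in a sorted list" as the ID are all assumptions made for illustration, and the parser only looks at the IndexFile and IndexName directives.

```perl
use strict;
use warnings;

package My::SwishIndexes;    # hypothetical module name

# Parse the text of one swish-e config file and return a hash ref with the
# IndexFile and IndexName values (keys lower-cased; other directives ignored).
sub parse_config {
    my ($text) = @_;
    my %conf;
    for my $line ( split /\n/, $text ) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        if ( $line =~ /^\s*(IndexFile|IndexName)\s+(.+?)\s*$/i ) {
            my ( $key, $value ) = ( lc $1, $2 );
            $value =~ s/^(["'])(.*)\1$/$2/;    # strip surrounding quotes
            $conf{$key} = $value;
        }
    }
    return \%conf;
}

# Build an ID => config table. The ID is just the position in the list
# sorted by index file name (an assumption of this sketch), so no file
# name ever has to appear in a URL.
sub build_table {
    my (@configs) = @_;
    my ( %table, $id );
    for my $conf ( sort { $a->{indexfile} cmp $b->{indexfile} } @configs ) {
        $table{ ++$id } = $conf;
    }
    return \%table;
}

# Translate an ID taken from a request into an index file name.
# Returns undef for anything that is not an ID we handed out.
sub index_file_for {
    my ( $table, $id ) = @_;
    return unless defined $id && $id =~ /^\d+$/;
    my $conf = $table->{$id} or return;
    return $conf->{indexfile};
}

package main;

my $table = My::SwishIndexes::build_table(
    My::SwishIndexes::parse_config(qq{IndexFile /data/faq.index\nIndexName "FAQ"\n}),
    My::SwishIndexes::parse_config(qq{IndexFile /data/docs.index\nIndexName "Site docs"\n}),
);

for my $id ( sort { $a <=> $b } keys %$table ) {
    print "$id: $table->{$id}{indexname} -> ",
        My::SwishIndexes::index_file_for( $table, $id ), "\n";
}
```

Both letters.cgi and the search script would use the same lookup, so a request can only ever name an index that the config files define. Position-based IDs shift when an index is added, so a real module would probably want something more stable.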
I frankly don't like a design that parses the index config files on every
request. That data is static for the most part, so it should only be parsed
when something changes. It would be reasonably easy to use the Storable
module and then only re-parse the indexes when one was modified.
Storable will save/restore perl data structures on disk. Caching is the
answer to many questions.
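A rough sketch of that caching idea, assuming the parsed data is a plain Perl structure and that comparing file modification times is a good enough staleness test. The function name load_indexes, the cache file name, and the $parse callback are made up for this example; store() and retrieve() are the real Storable calls.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Return the parsed data for the files in @$config_files, re-parsing only
# when the cache file is missing or older than one of the config files.
# $parse is a code ref that does the real parsing (a placeholder here).
# Caveats of this sketch: mtimes have one-second granularity, and config
# files that are added or removed are not detected.
sub load_indexes {
    my ( $cache_file, $config_files, $parse ) = @_;

    my $cache_mtime = ( stat $cache_file )[9];
    my $stale       = !defined $cache_mtime;
    for my $file (@$config_files) {
        my $mtime = ( stat $file )[9];
        die "Cannot stat $file: $!" unless defined $mtime;
        $stale ||= $mtime > $cache_mtime;
    }

    unless ($stale) {
        my $cached = retrieve($cache_file);
        return $cached if defined $cached;    # fall through if unreadable
    }

    my $data = $parse->($config_files);
    store( $data, $cache_file ) or die "Cannot write $cache_file: $!";
    return $data;
}

# Demo: call twice and count how often the "parser" actually runs.
my $dir  = tempdir( CLEANUP => 1 );
my $conf = "$dir/docs.conf";
open my $fh, '>', $conf or die "Cannot write $conf: $!";
print {$fh} "IndexFile /data/docs.index\n";
close $fh or die "Cannot close $conf: $!";

my $parses = 0;
my $parse  = sub { $parses++; return { files => [ @{ $_[0] } ] } };
load_indexes( "$dir/cache.sto", [$conf], $parse ) for 1 .. 2;
print "parsed $parses time(s)\n";
```

The second call should come straight from the cache file, so the parser only runs again after a config file's modification time moves past the cache's.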
Along the same lines, the word lists probably should be cached as static
pages, too, but swish-e is reasonably fast at returning those lists. It's
better to turn dynamic content into static than the other way around.
The script tries to somewhat resemble MVC programming. It's not that
successful, but, hey, I wrote it in about an hour.
- The idea is that the Controller (the C) is the only thing that knows
about CGI -- it converts the request into data that the Model can work with
(in this basic example there are only two CGI params -- not a very complex
example).
- The Model just returns a Perl data structure. It knows how to take a list
of indexes and a letter, and return a list of words found in a swish index,
but knows nothing about CGI or HTML. In theory, you should be able to
replace the Model to use, say, MySQL or a DBM file that has the list of words
without changing anything else in the script.
- Then the View is the only thing that works with HTML. In this example
I'm using Template-Toolkit for the View. You could use perl subroutines
with <<HERE docs if you wanted, but I like TT. The template is at the
bottom of the script -- normally you would have the template stored on disk
(and perhaps cached in pre-compiled format for speed).
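The three roles above can be sketched in a few lines. This is not the code from letters.cgi: the Model here reads from a hard-coded hash instead of a swish-e index, the View uses a here-doc rather than Template-Toolkit so the example has no dependencies, the Controller takes a plain hash of parameters instead of a CGI object, and all the sub names and the "letter"/"index" parameter names are invented for illustration.

```perl
use strict;
use warnings;

# --- Model: knows about indexes and words, nothing about CGI or HTML ---
# Stand-in data; a real Model would ask swish-e (or MySQL, or a DBM file).
my %WORDS = (
    docs => [qw(apple apply banana)],
    faq  => [qw(avocado cherry apple)],
);

sub model_words {
    my ( $indexes, $letter ) = @_;
    my %seen;
    my @words = grep { !$seen{$_}++ }
                grep { /^\Q$letter\E/i }
                map  { @{ $WORDS{$_} || [] } } @$indexes;
    return [ sort @words ];
}

# --- View: the only place HTML appears ---
sub escape_html {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

sub view_html {
    my ($data) = @_;
    my $letter = escape_html( $data->{letter} );
    my $items  = join '', map { '<li>' . escape_html($_) . "</li>\n" } @{ $data->{words} };
    return <<"HTML";
<h1>Words starting with $letter</h1>
<ul>
$items</ul>
HTML
}

# --- Controller: the only place that knows about request parameters ---
sub controller {
    my (%param) = @_;
    my $letter = 'a';    # default when the parameter is missing or invalid
    if ( defined $param{letter} && $param{letter} =~ /^([a-z])$/i ) {
        $letter = lc $1;
    }
    my @indexes = grep { exists $WORDS{$_} } @{ $param{index} || [] };
    my $words   = model_words( \@indexes, $letter );
    return view_html( { letter => $letter, words => $words } );
}

print controller( letter => 'a', index => [ 'docs', 'faq' ] );
```

Swapping model_words() for something backed by a database, or view_html() for a template, should not require touching the other two parts.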
Templates are a good thing to just use all the time, even for small
projects. It really makes life easier. In this example the associated
search script (or your entire site) might use templates, so you could
easily add this script to your site and have it look exactly like the
other pages on your site.
Hum, what else:
- Oh, I'd put the configuration into a separate file and use "do $config"
to load from disk.
- I used warn() in a lot of places where die() would be better. Better to
see something failing and fix it. Examples are when a swish config file is
missing IndexFile or IndexDir.
- Like I said, this was quickly written, so it may have bugs or some ugly
bit of code that could be better written (suggestions welcome).
- email me directly for questions about the script as Perl programming is
not really on topic for the swish-e list.
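On the first two points in the list above (a separate config file loaded with "do", and die() rather than warn() when something is missing), a minimal sketch follows. The setting names config_dir and swish_binary and the function name load_config are hypothetical; the demo writes a throwaway config file so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Load a Perl config file whose last expression is a hash ref, and die
# with a useful message on any failure rather than warning and carrying on.
# Note: "do" searches @INC for relative paths, so pass an absolute path
# (or one starting with ./).
sub load_config {
    my ($file) = @_;
    my $config = do $file;
    unless ( defined $config ) {
        die "Cannot parse $file: $@" if $@;
        die "Cannot read $file: $!";
    }
    die "$file did not return a hash ref" unless ref $config eq 'HASH';
    for my $key (qw(config_dir swish_binary)) {    # hypothetical settings
        die "$file is missing required setting '$key'"
            unless defined $config->{$key};
    }
    return $config;
}

# Demo: write a small config file and load it.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
print {$fh} "+{ config_dir => '/tmp/swish-conf', swish_binary => 'swish-e' }\n";
close $fh or die "Cannot close $path: $!";

my $config = load_config($path);
print "config_dir is $config->{config_dir}\n";
```

The same pattern would apply to a swish config file that lacks IndexFile or IndexDir: failing loudly at load time is easier to debug than a warning in the server log.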
Received on Tue Jul 2 15:01:40 2002