> My sysadmin won't let me pass wildcards through the CGI script.
That is not an entirely bad idea depending on the web server's config.
> What's the general opinion of people who've been using swish for a long
> time? Does stemming pay off?
I've not noticed any major ill effects of stemming on my system. It
generally works quite well. I added a minimum word length to my
stemmer.c, at Bill Moseley's suggestion, I believe. That seems to
have eliminated a number of false or inaccurate result lists. Short
words don't stem well: they often look like a suffix that should be
stripped, so you can end up with a zero-character word after stemming.
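
The minimum-length guard can be sketched roughly as follows. This is a
Python illustration, not the actual change to stemmer.c: toy_stem is a
stand-in suffix stripper rather than the SWISH stemming algorithm, and
MIN_STEM_LENGTH is a hypothetical value, since the threshold actually
used is not stated here.

```python
# Hypothetical threshold; the real value used in stemmer.c is not given.
MIN_STEM_LENGTH = 4


def toy_stem(word):
    """Stand-in stemmer: strip one common suffix. Not the SWISH algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


def guarded_stem(word, min_length=MIN_STEM_LENGTH):
    """Return words shorter than min_length unchanged; stem the rest."""
    if len(word) < min_length:
        return word
    return toy_stem(word)
```

Without the guard, toy_stem("ed") strips the whole word and returns an
empty string; guarded_stem("ed") leaves it alone, while longer words
such as "searching" are still stemmed.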
> Is it too slow
I would call it anything but slow. Total running time for my entire
script including forking SWISH is rarely more than a few hundred
milliseconds. It is almost instantaneous.
> that confuse the lo-tech users?
I added a bit of code to the output parser in my search script that
looks for "Stemming Applied: 1" and prints a note indicating that
stemming is active on that particular index. I did the same for my
Soundex module, as well. That way a search of multiple indices will
automatically indicate which options are active.
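
A rough Python sketch of that kind of output check is below. Only the
string "Stemming Applied: 1" comes from this message; the Soundex
marker, the note wording, the function name, and the sample output are
assumptions made for illustration.

```python
# Marker strings mapped to the note shown to the user.
# "Stemming Applied: 1" is quoted in the message; the Soundex marker is
# an assumption made by analogy.
OPTION_MARKERS = {
    "Stemming Applied: 1": "Stemming is active on this index.",
    "Soundex Applied: 1": "Soundex matching is active on this index.",
}


def active_option_notes(output_text):
    """Collect one note per option marker found in the search output."""
    notes = []
    for line in output_text.splitlines():
        for marker, note in OPTION_MARKERS.items():
            if marker in line and note not in notes:
                notes.append(note)
    return notes


# Made-up output for illustration only, not captured from a real run.
sample_output = """\
# Stemming Applied: 1
# Soundex Applied: 0
1000 /docs/example.html "Example" 1234
"""

for note in active_option_notes(sample_output):
    print(note)
```

Running the check once per index's output gives a separate set of notes
for each index in a multi-index search.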
> does it return false hits
It could potentially do that. I'd be just as concerned about it not
matching words that it should. It is a trade-off: it will make a few
mistakes. You just don't want it to match or miss too many words
either way.
http://www.webaugur.com/search/
--
,David Norris
The OpenSA Project - http://www.opensa.de/
Dave's Web - http://www.webaugur.com/dave/
ICQ Universal Internet Number - 412039
E-Mail - dave@webaugur.com
Received on Fri Nov 19 13:26:07 1999