Re: RE: stemming

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Sun Nov 21 1999 - 23:20:30 GMT
At 09:05 AM 11/19/99 -0800, SRE wrote:
>>You can find out more in the Swish list archive, but you should know that
>>wild card searches don't work as expected with Stemming.
>
>My sysadmin won't let me pass wildcards through the CGI script.
>The form is set up to strip all that stuff, which is why I turned
>stemming on in the first place (poor man's wild card?).

Your sysadmin has the right idea, but might be missing the mark.  The
examples you might have seen of pipe-opened Swish in Perl are a bit scary,
since a pipe open given a single command string hands the query to the shell.
If you are writing a front-end in Perl I'd suggest doing a fork/exec to
swish as suggested in perldoc perlipc under "Safe Pipe Opens."  And always
run your scripts with -T.
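
Here is a minimal sketch of that fork/exec pattern, along the lines of the
perlipc section.  The helper name run_without_shell, the swish-e path, and
the index path are made up for illustration.  The -w (search words) and -f
(index file) switches are the usual swish-e ones, but check them against
your version.  In a real CGI script you would also run under -T and untaint
the query before using it.

```perl
use strict;
use warnings;

# Run a program with a list of arguments and return its output lines.
# No shell is involved, so metacharacters in @args are passed literally.
sub run_without_shell {
    my ($program, @args) = @_;

    # Forking form of open: the parent gets a handle that reads the
    # child's STDOUT, and the child sees a return value of 0.
    my $pid = open(my $from_child, '-|');
    die "Cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: the list form of exec never invokes /bin/sh.
        exec { $program } $program, @args
            or die "Cannot exec $program: $!";
    }

    # Parent: collect whatever the child printed.
    my @lines = <$from_child>;
    close $from_child;
    return @lines;
}

# Hypothetical values, for illustration only.
my $swish = '/usr/local/bin/swish-e';
my $index = '/path/to/index.swish';
my $query = 'rockies';

print run_without_shell($swish, '-w', $query, '-f', $index) if -x $swish;
```

Because the query travels as a single list element, something like
"rock*; rm -rf /" reaches swish as one literal argument instead of being
interpreted by a shell.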


>What's the general opinion of people who've been using swish for a long
>time? Does stemming pay off? Is it too slow, does it return false hits
>that confuse the lo-tech users?

Good question.  I don't know what you mean by false hits.  Users may be
surprised to search for rockies and find all your links to "Rocky and
Bullwinkle."  For smallish databases and lo-tech users I would think
stemming would be much better.  They may not be savvy enough to search for
different versions of a word.  I'd rather err on the side of too many results
than fail to return something that might be related to their search.
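
To see why "rockies" can pull in "Rocky", here is a toy sketch assuming a
Porter-style stemmer.  It covers only two suffix rules and is not the code
Swish actually uses.  toy_stem is a made-up name.

```perl
use strict;
use warnings;

# Toy reduction covering only two Porter-style rules; not a full stemmer.
sub toy_stem {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/ies$/i/;                # plural "-ies" becomes "-i"
    $word =~ s/^(.*[aeiou].*)y$/$1i/;  # final "y" becomes "i" if the stem has a vowel
    return $word;
}

printf "%-8s -> %s\n", $_, toy_stem($_) for qw(rockies Rocky);
```

Both words reduce to the same stem, so once stemming is applied to the
index and the query, a search for one matches the other.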

>Thanks for the tip. If I ever enable wild cards, I'll turn off stemming.

Or patch the source!


Have fun,




Bill Moseley
mailto:moseley@hank.org
Received on Sun Nov 21 15:21:26 1999