Re: Stemming with Windows

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Jul 31 2002 - 19:29:41 GMT
At 11:06 AM 07/31/02 -0700, Erik.Pugh@bpd.treas.gov wrote:
>In the precompiled release of swish-e 2.1 for Windows, is Stemming built 
>into the executable?  If so...........

Yes.

>We also have a need to deploy swish-e on a Windows machine, I would like 
>to have the exact same set of features on both machines, but am
>currently stuck at the point where I implement stemming.  I have 
>everything else working on Windows except for stemming, and was wondering
>if anyone has any solutions or suggestions on how to compile or use the 
>SWISH::Stemmer perl module on windows.  I know the Stemmer that
>is included in the swish-e dev windows release needs to be compiled first. 

Hmm, I'm not sure. If you have a C compiler you should be able to build the
module the normal perl way (perl Makefile.PL, then make, then make install),
but I suspect you would need a compiler and libraries that are compatible
with your installed Perl binary.

I don't think creating the ppm is hard; you just need a current compiler
for Windows, which I don't have.  I doubt my version of qc will work.

The ActivePerl FAQ describes how to make a ppm distribution:

http://aspn.activestate.com//ASPN/Reference/Products/ActivePerl/faq/ActivePerl-faq2.html#how_to_make_ppm_distribution

I assume the reason you need the stemmer module is to do term highlighting
with the swish.cgi script?

>Additionally, why does the Stemmer.pm module need 
>to be compiled?  Is there a 100% perl - alternative module 
>that would work?

Two reasons.  First, the stemmer used for highlighting text (e.g.
converting text to stemmed words) must match exactly the stemmer used in
swish when indexing.  Otherwise you might end up stemming the same word
into two different stems.  Second, speed.
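
A minimal sketch of the first point, in Python rather than Perl just to keep
it short.  Both stemmers below are made-up toys (neither is the real swish-e
stemmer or SWISH::Stemmer), and the names index_stem, other_stem and
highlight are hypothetical.  It only shows how highlighting misses words
when its stemmer disagrees with the one used at index time.

```python
def index_stem(word):
    """Toy suffix stripper standing in for the stemmer used when indexing."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Strip the first matching suffix, keeping at least 3 characters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def other_stem(word):
    """A slightly different toy stemmer: it never strips 'ing'."""
    word = word.lower()
    for suffix in ("ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def highlight(text, query, stem):
    """Wrap every word whose stem equals the stem of the query word."""
    target = stem(query)
    return " ".join(
        f"<b>{w}</b>" if stem(w) == target else w for w in text.split()
    )


if __name__ == "__main__":
    text = "indexing indexed files"
    # Same stemmer as the index: both word forms are highlighted.
    print(highlight(text, "index", index_stem))
    # Mismatched stemmer: "indexing" is missed.
    print(highlight(text, "index", other_stem))
```

With the matching stemmer both "indexing" and "indexed" get marked; with the
mismatched one "indexing" keeps its own stem and is skipped.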

>I do understand that swish-e builds the index files according to how the 
>binary was built, so if the binary doesn't know how to build a stemmed 
>index, then I realize stemming won't work when you query the files.

No, it doesn't depend on how the binary was built. The stemmer module is a
standard part of swish.  You have to enable stemming in the config file for
it to build a stemmed index.
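
As a rough illustration only: assuming the 2.x config directive is
UseStemming (check the docs for your version), a config file might contain
something like the fragment below.  The directory and index file names are
placeholders.

```
# Hypothetical swish-e config fragment; paths are placeholders.
IndexDir    ./docs
IndexFile   ./index.swish-e
# Assumed directive name for turning on stemming at index time.
UseStemming yes
```

The index has to be rebuilt after changing this, since stemming is applied
while the index is built.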


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Wed Jul 31 19:33:11 2002