multi language and stemming

From: Brad Miele <bmiele(at)not-real.auroraquanta.com>
Date: Fri Oct 25 2002 - 11:57:05 GMT

Hi,

I have been using swish-e with great success for some time now. Currently
we are using the prog method and XML to index our database of 80,000 image
records, and the indexing and searching are fast and consistent.

We are about to begin our first Spanish-language site, with German to
follow, and I am wondering if anyone has experience with using alternate
stemming options.

I have found Snowball (http://snowball.tartarus.org/), and my current plan
is to take my incoming index data, pre-stem it in the prog portion of
the indexing, and then put it in a stem_metaname field for indexing.
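
To make the plan concrete, here is a rough sketch in Python of what the
pre-stemming step might look like. It is only an illustration under several
assumptions: the record layout (<doc>, <caption>) is hypothetical, toy_stem
is a crude stand-in that would be replaced by a real Snowball stemmer, and
the header block follows my understanding of the prog-method document
headers (Path-Name, Content-Length, Document-Type), which should be checked
against the swish-e documentation for the installed version.

```python
import re
from xml.sax.saxutils import escape


def toy_stem(word):
    """Hypothetical stand-in for a Snowball stemmer (NOT a real algorithm).

    It only strips a couple of plural endings so the example runs without
    extra dependencies; swap in a real Spanish or German stemmer here.
    """
    for suffix in ("es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_record(path, caption, stem=toy_stem):
    """Return one document (headers + XML body) as bytes for the indexer."""
    # Lowercase and split into words, then stem each word.
    words = re.findall(r"\w+", caption.lower())
    stemmed = " ".join(stem(w) for w in words)

    # Original text plus a parallel pre-stemmed field (names are assumed).
    xml = (
        "<doc>"
        f"<caption>{escape(caption)}</caption>"
        f"<stem_metaname>{escape(stemmed)}</stem_metaname>"
        "</doc>"
    )
    body = xml.encode("utf-8")

    # Content-Length must be the byte length of the body, not the
    # character count. The Document-Type value is an assumption.
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(body)}\n"
        "Document-Type: XML*\n"
        "\n"
    ).encode("utf-8")
    return header + body


if __name__ == "__main__":
    import sys
    sys.stdout.buffer.write(build_record("img/1.xml", "Flores rojas"))
```

One consequence of this design is that query terms would need to be run
through the same stemmer before searching stem_metaname, since the index
would hold only the stemmed forms in that field.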

My question is really whether this is the best way to go about it. Has
anyone come across, or built, multi-language stemmers that can replace the
existing swish-e stemmer? Any firsthand experience would be appreciated.

Brad
------------------------------------------------------------
 Brad Miele
 Chief Technology Officer
 Aurora & Quanta Productions
 bmiele@auroraquanta.com
 (207)828-8787 x110

FreeBSD -- because rebooting is for adding new hardware!
Received on Fri Oct 25 12:00:38 2002