Bug in stemmer.c

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Oct 19 1999 - 15:44:06 GMT
In stemmer.c, function: EndsWithCVC

   if ( (length = strlen(word)) < 2 )
       return( FALSE );

should be:

   if ( (length = strlen(word)) < 3 )
       return( FALSE );

This routine is looking at the last three characters of a string, so it
makes sense to make sure there are at least three characters instead of two.

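With the old check, a two-character word passed the length test, so
EndsWithCVC read one character before the beginning of the string. Whatever
happened to sit in that byte affected the result, which is why the same word
could stem differently.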
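
For anyone who wants to see the guard in context, here is a minimal sketch of
such a routine. It is not the actual code from stemmer.c: the name
ends_with_cvc, the FALSE/TRUE defines and the exact character sets are my
assumptions, following the usual Porter rule that the final consonant must
not be w, x or y.

```c
#include <string.h>

#define FALSE 0
#define TRUE  1

/* Hypothetical stand-in for EndsWithCVC, not the real stemmer.c source.
 * Returns TRUE if word ends in consonant-vowel-consonant, where the
 * final consonant is not w, x or y. */
int ends_with_cvc(const char *word)
{
    size_t length = strlen(word);
    const char *end;

    /* Three characters are examined below, so require at least three. */
    if (length < 3)
        return FALSE;

    end = word + length - 1;   /* points at the last character */

    return strchr("aeiouwxy", end[0])  == NULL   /* consonant, not w/x/y */
        && strchr("aeiouy",   end[-1]) != NULL   /* vowel */
        && strchr("aeiou",    end[-2]) == NULL;  /* consonant */
}
```

With the test at 3, a word like "at" returns FALSE right away and end[-2]
is never evaluated for it.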

Amazing how a ten-year-old routine could have such an obvious bug.


Bill Moseley
mailto:moseley@hank.org
Received on Tue Oct 19 08:45:08 1999