In stemmer.c, function: EndsWithCVC
    if ( (length = strlen(word)) < 2 )
        return( FALSE );

should be:

    if ( (length = strlen(word)) < 3 )
        return( FALSE );

This routine is looking at the last three characters of a string, so it
makes sense to make sure there are at least three characters instead of two.
The error caused EndsWithCVC to read before the beginning of the string
whenever it was handed a two-character word, so the same word could be
stemmed differently depending on what happened to precede it in memory.
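
For anyone who wants to see the guard in isolation, here is a minimal
standalone sketch. It is not the code from stemmer.c: the function name
ends_with_cvc_sketch and the vowel/consonant sets are my own assumptions,
loosely following the Porter rule that the final consonant must not be
w, x, or y. The point is only that three characters get inspected, so a
two-character word such as "at" has to be rejected before the third
look-back would land one byte in front of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the routine from stemmer.c.
   Returns 1 when the last three characters of word are
   consonant, vowel, consonant and the final consonant is
   not w, x, or y; returns 0 otherwise. */
static int ends_with_cvc_sketch(const char *word)
{
    size_t length = strlen(word);
    const char *end;

    /* Three characters are inspected below, so anything
       shorter is rejected before touching end[-2]. */
    if (length < 3)
        return 0;

    end = word + length - 1;                    /* last character */
    return strchr("aeiouwxy", end[0]) == NULL   /* consonant, not w/x/y */
        && strchr("aeiouy", end[-1]) != NULL    /* vowel (y assumed to count) */
        && strchr("aeiou", end[-2]) == NULL;    /* consonant */
}
```

With the check written as "< 2", the call ends_with_cvc_sketch("at")
would go on to read end[-2], which is word[-1], and the answer would
depend on whatever byte sits there.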
Amazing how a ten-year-old routine could have such an obvious bug.

Bill Moseley
mailto:moseley@hank.org
Received on Tue Oct 19 08:45:08 1999