Re: incrementing word position

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Sep 22 2000 - 15:10:26 GMT
At 04:23 PM 09/22/00 +0200, jmruiz@boe.es wrote:
>I am not sure if BumpPositionCounterCharacters needs to be a 
>subset of IgnoreLastChar. I do not think so. But the check in the code 
>has to consider that it can be part of IgnoreLastChar. 

Right. I wasn't thinking that through.  The tricky part is telling it to
bump only at the end of a sentence.  That means a period followed by a
space, end of line, or end of data.
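
One way to picture that end-of-sentence test is the rough Python sketch
below. It is only an illustration of the rule as stated above, not the
indexer's code; the names SENTENCE_END and sentence_end_offsets are made up
for this example.

```python
import re

# A period counts as a sentence end only when it is followed by
# whitespace (a space or a newline) or by the end of the data.
SENTENCE_END = re.compile(r"\.(?=\s|\Z)")

def sentence_end_offsets(text):
    """Return the offsets of periods that end a sentence under that rule."""
    return [m.start() for m in SENTENCE_END.finditer(text)]
```

With this rule the period inside "v1.2" is not treated as a sentence end,
while the periods before a space and at the end of the data are.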

>Basically, in stripIgnoreLastChar function, if a char is stripped, we 
>have to check also if the stripped character is in 
>BumpPositionCounterCharacters. If so, counter is incremented.

Ok, so that should take care of limiting phrases to sentences.  And that
would work for this:

    IgnoreLastChar ).,
    BumpPositionCounterCharacters .|

    "(Two characters are stripped, and the position is bumped once.)"

The "." and ")" are stripped, and since at least one of those chars is a
bump char, the position is bumped (once, not once per stripped char).
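
The strip-and-check step could look roughly like the Python sketch below.
It is a hypothetical stand-in for the stripIgnoreLastChar idea discussed
above, not the actual function; the character sets are hard-coded from the
example settings, and the function name and return shape are assumptions.

```python
# Character sets taken from the example settings above (assumed values).
IGNORE_LAST_CHAR = set(").,")   # chars stripped from the end of a word
BUMP_CHARS = set(".|")          # chars that bump the position counter

def strip_ignore_last_char(word):
    """Strip trailing ignore chars; report whether any was a bump char.

    Returns (stripped_word, bump). bump is a flag rather than a count,
    so stripping several chars bumps the position at most once.
    """
    bump = False
    while word and word[-1] in IGNORE_LAST_CHAR:
        if word[-1] in BUMP_CHARS:
            bump = True
        word = word[:-1]
    return word, bump
```

For the last word of the example, "once.)", this strips ")" and then ".",
and reports a single bump because "." is a bump char.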

>Additionally, we have to increment pointer if the next non blank char 
>after a word is in BumpPositionCounterCharacters.

So:  "this bumps|and so does this | one" would bump the position at both
pipes, right?

Together, something like this would bump the position twice:

     "This is the end.  |Start of another sentence."
                     ^  ^
Two bumps.  But I don't see that as a problem.
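
To make the two rules concrete, here is a rough Python sketch that assigns
word positions to a string. It is an illustration under stated assumptions,
not the indexer's logic: words are taken to be runs of \w characters, the
character sets are hard-coded from the example settings, a stripped period
bumps even when no space follows it, and word_positions is a made-up name.

```python
import re

IGNORE_LAST_CHAR = ").,"        # assumed, from the example settings
BUMP_CHARS = set(".|")          # assumed, from the example settings

# A word, any trailing chars that would be stripped, then any blanks.
TOKEN = re.compile(r"(\w+)([%s]*)(\s*)" % re.escape(IGNORE_LAST_CHAR))

def word_positions(text):
    """Return a list of (word, position) pairs, applying both bump rules."""
    positions = []
    pos = 1
    for m in TOKEN.finditer(text):
        word, stripped, _blanks = m.groups()
        positions.append((word, pos))
        pos += 1
        # Rule 1: a stripped trailing char is a bump char (bump once).
        if any(c in BUMP_CHARS for c in stripped):
            pos += 1
        # Rule 2: the next non-blank char after the word is a bump char.
        nxt = text[m.end():m.end() + 1]
        if nxt and nxt in BUMP_CHARS:
            pos += 1
    return positions
```

On "This is the end.  |Start of another sentence." the word "end" gets
position 4 and "Start" gets position 7 instead of 5, which is the double
bump described above.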

Thanks,

Bill Moseley
mailto:moseley@hank.org
Received on Fri Sep 22 15:10:53 2000