Re: incrementing word position

From: <jmruiz(at)not-real.boe.es>
Date: Fri Sep 22 2000 - 14:26:37 GMT
Hi Bill,

On 22 Sep 2000, at 6:19, Bill Moseley wrote:

> At 01:20 AM 09/22/00 -0700, you wrote:
> >> Would that be very hard to implement?
> >> 
> >Not really. I was thinking of some config directive like:
> >BumpPositionCounterCharacters  |-()
> 
> What about multiple characters or some way of saying bump on a period, but
> only if it's the end of a sentence?  So, 
> 
> "It was expensive.  The price was $5.24 at the local store."
>                  ^^                 ^                     ^
>                 Bump              No bump               Bump
> 
> Maybe BumpPositionCounterCharacters would need to be a subset of
> IgnoreLastChar?
> 

There is another possibility:
"It was expensive . The price"

Here, there is a blank before and after the period. So we have at
least 4 possibilities:

1- "word.  word"
2- "word.word"
3- "word . word"
4- "word .word"

And what about this one:
5- "word... word"

I am not sure that BumpPositionCounterCharacters needs to be a
subset of IgnoreLastChar. I do not think so. But the check in the code
has to consider that a bump character can also be in IgnoreLastChar.

Basically, in the stripIgnoreLastChar function, if a char is stripped, we
also have to check whether the stripped character is in
BumpPositionCounterCharacters. If so, the position counter is incremented.

Additionally, we have to increment the position counter if the next
non-blank char after a word is in BumpPositionCounterCharacters.
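
A rough sketch of these two checks, in Python rather than the C of the real code, just to see how the cases above behave. Everything in it is an assumption for illustration: the function name word_positions, splitting on whitespace only, positions starting at 1, a default bump list containing only the period, and counting at most one bump between two words even if several bump characters appear (as in "word... word"). It is not the actual stripIgnoreLastChar code.

```python
def word_positions(text, bump_chars=".", ignore_last_chars=".,;:!?"):
    """Return (word, position) pairs, adding one extra position step
    whenever a bump character separates two words.

    Hypothetical sketch: whitespace tokenization and both default
    character lists are assumptions, not swish's real behaviour.
    """
    position = 0
    pending_bump = False  # a bump character was seen since the last word
    result = []
    for token in text.split():
        # Second check: the next non-blank char after the previous word
        # is a bump character (cases "word . word" and "word .word").
        start = 0
        while start < len(token) and token[start] in bump_chars:
            pending_bump = True
            start += 1
        word = token[start:]

        # First check: while stripping IgnoreLastChar characters, note
        # whether any stripped character is also a bump character.
        bump_after = False
        while word and word[-1] in ignore_last_chars:
            if word[-1] in bump_chars:
                bump_after = True
            word = word[:-1]

        if word:
            position += 1
            if pending_bump:
                position += 1  # the extra step for the bump
            result.append((word, position))
            pending_bump = bump_after
        else:
            # Token was punctuation only, e.g. the lone "." in case 3.
            pending_bump = pending_bump or bump_after
    return result


for sample in ["word.  word", "word.word", "word . word",
               "word .word", "word... word"]:
    print(repr(sample), word_positions(sample))
```

With these assumptions, cases 1, 3, 4 and 5 all put the second word at position 3, while case 2 stays a single word with no bump. The same holds for "$5.24", since the period is inside the token and is never stripped.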

What do you think?

> >
> >I have also been working on your last posts
> >- Get StopWords (SwishStopWords function)
> 
> Any way to get a switch to get the stopwords printed in the headers of the
> swish binary?
> 
Sure

cu
Jose
Received on Fri Sep 22 14:27:03 2000