Re: INDEX_WORDS interpretation

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Mar 29 2004 - 16:04:37 GMT
On Mon, Mar 29, 2004 at 07:42:05AM -0800, Peter Karman wrote:
> I'm looking for help here from someone familiar with the format of the 
> index.
> 
> I'd like to determine the frequency of words in a document set, based on 
> the swish index.
> 
> I know that:
> 
> swish-e -T INDEX_WORDS
> 
> will give me a dump of all the words in the index. How do I read that 
> output?

That's why I added -T index_words_full ;)

I'll bet if you compare the output of the two you can figure out what all
the numbers mean.

BTW - I also added -T index_words_meta, whose output is easy to parse when
building a dictionary of the words in the index (for use with Aspell).
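
A minimal sketch of that dictionary-building step, in Python. The exact
layout of the -T index_words_meta output is not shown in this thread, so the
code assumes (and this is only an assumption) that each non-blank line starts
with the word, followed by whitespace-separated fields. The function name
words_from_dump is hypothetical and not part of swish-e.

```python
def words_from_dump(lines):
    """Collect the unique words from a dump, one entry per input line.

    ASSUMPTION: the word is the first whitespace-separated field on each
    non-blank line. Check this against real swish-e output before use.
    """
    words = set()
    for line in lines:
        fields = line.split()
        if fields:                 # skip blank lines
            words.add(fields[0])   # keep only the assumed word field
    return sorted(words)           # sorted, de-duplicated word list
```

If the assumption holds, passing sys.stdin to words_from_dump while piping in
the output of swish-e -T index_words_meta would give a sorted word list, one
word per entry, that could be written out as a plain word list for Aspell.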


-- 
Bill Moseley
moseley@hank.org
Received on Mon Mar 29 08:04:37 2004