total words in a file

From: Paul J. Lucas <pjl(at)not-real.ptolemy.arc.nasa.gov>
Date: Sat May 15 1999 - 00:17:54 GMT
	If I'm not mistaken from looking at the SWISH-E source code, it
	seems as though the total number of words in a file (used in
	ranking calculations) is the total number of INDEXED words and
	not the total number of ACTUAL words in a file.

	If you use the number of indexed words, I think that would
	yield (erroneously?) higher ranks for words in a given document
	than if the number of actual words were used instead.
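
	As a toy illustration of the concern (this is not SWISH-E's
	actual ranking code; the relative-frequency score, the stopword
	list, and the function name below are all assumptions made up
	for the example), dividing by the smaller indexed total gives a
	larger score than dividing by the actual total:

```python
# Toy illustration only: NOT SWISH-E's actual ranking formula.
# STOPWORDS and relative_frequency are hypothetical names for this sketch.
STOPWORDS = {"the", "a", "of", "in", "is", "and"}


def relative_frequency(word, words, indexed_only):
    """Return occurrences of `word` divided by the chosen total word count."""
    if indexed_only:
        # Count only words that would be indexed (stopwords are skipped).
        total = sum(1 for w in words if w not in STOPWORDS)
    else:
        # Count every word that actually appears in the file.
        total = len(words)
    return words.count(word) / total


doc = "the rank of a word in the index is the rank of the word".split()

by_indexed = relative_frequency("rank", doc, indexed_only=True)   # 2 / 5
by_actual = relative_frequency("rank", doc, indexed_only=False)   # 2 / 14
```

	Here "rank" occurs twice; only 5 of the 14 words are indexed, so
	the indexed-word score is 2/5 while the actual-word score is 2/14.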

	1. Is this correct?

	2. If so, can a justification be given as to why the number of
	   indexed words should be used as opposed to the number of
	   actual words?

	- Paul
Received on Fri May 14 17:15:10 1999