Re: Ranking question

From: Lauren Landsburg <lauren(at)>
Date: Wed Dec 04 2002 - 14:44:47 GMT
Bill Moseley wrote, quoting David VanHook:

DvH> > My question is, if I want to adjust those on a currently live, running
> > version of SWISH, do I have to recompile SWISH?  Or do these variables exist
> > someplace else where I can adjust them on the fly?
BM> Currently you must recompile.
BM> I don't see why they couldn't be configurable at indexing time -- or for
> that matter why they couldn't be adjusted at search time -- other than
> it's time coding something that not many people would use.

I'd like to weigh in with a vote requesting the option to make this
configurable at either indexing time or search time.  I believe our website
users _would_ use this option.  Or, if not the users, then at least I'd like
the option during indexing to change the weights between, say, the upcoming
MetaNamesRank and page titles.

Your defaults may be great!  But I can envision that I might like to fine-tune
them.

> The word frequency is biased by the "structure" which is a flag that says
> where in the html the word is found (e.g. in <title> or <h1>).  For

Including <h2> through <h6>?  I'm curious.  Are there different weights by
heading, or the same weights for all headings?  My apologies for asking you
rather than combing through the compilable file.

Your current compiled defaults work extremely well for most searches on our
well-used website.  We average 9000 unique visitors a week and over 200
effective searches a day ("effective" meaning that I've removed trivial errors
from the raw sample, such as failing to type in any keywords at all).

However, we're serving up whole books online for research purposes.  Many
individual web pages are relatively long, containing a chapter or even several
short chapters.  The ability to make adjustments at search time would allow our
users to choose to emphasize or de-emphasize items in, say, chapter titles and
subtitles.  Although only the more sophisticated users would take advantage of
such an option, I can easily see _some_ control at search time being very
useful in bringing specific books to the top of what may be a long list for any
given keyword.
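
As an illustration only, here is a minimal Python sketch of what search-time
adjustable structure weights might look like.  It is not Swish-e's code: the
structure names, the weight values, and the rank() function are all
assumptions invented for this example.

```python
# Hypothetical structure weights. These names and values are illustrative
# assumptions, not Swish-e's compiled defaults.
DEFAULT_WEIGHTS = {"title": 5.0, "heading": 3.0, "body": 1.0}


def rank(hits, overrides=None):
    """Score one document from per-structure hit counts.

    hits maps a structure name (where in the page the word was found)
    to the number of times the search word occurred there.
    overrides lets the caller replace individual weights at search time.
    """
    weights = {**DEFAULT_WEIGHTS, **(overrides or {})}
    # Structures without a configured weight count as plain body text.
    return sum(weights.get(where, 1.0) * count for where, count in hits.items())


# One hit in a chapter heading, four in the body text.
chapter_hits = {"heading": 1, "body": 4}
default_score = rank(chapter_hits)
boosted_score = rank(chapter_hits, {"heading": 10.0})
```

With the default weights the page scores 3.0 + 4.0 = 7.0.  A user who asks to
emphasize chapter headings (heading weight 10.0) gets 14.0 for the same page,
which would move it up the result list without reindexing.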

From an earlier email (Sun, 1 Dec 2002, [SWISH-E] MetaNamesRank (was: Multiple
property values)):

BM> For example, maybe words that are in the
> first 100 words of the document should be ranked higher.

I would vote against this unless it can be overridden at indexing time.  

For the website I've been talking about, the beginnings of every page are
nearly identical, offering up basic information about the website and the book.
Using frames to deliver this repetitious information is not desired by our
users, nor would it accomplish my client's goals.

Plus, the content of the second chapter on a page is just as important as that
of the first chapter on the page.
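
To make the override concrete, here is a small Python sketch of a position
bonus that can be switched off.  The function name, the parameter names, and
the default values are hypothetical and chosen only for illustration; nothing
here reflects how Swish-e actually ranks.

```python
def position_score(positions, early_cutoff=100, early_bonus=2.0):
    """Score the hits for one word, boosting hits near the top of the page.

    positions holds the 0-based word offset of each hit in the document.
    Hits before early_cutoff count early_bonus each; later hits count 1.0.
    Setting early_bonus=1.0 (or early_cutoff=0) disables the boost.
    """
    return sum(early_bonus if p < early_cutoff else 1.0 for p in positions)


# One hit in the repeated page header (word 5), one in a later chapter.
hits = [5, 250]
with_bonus = position_score(hits)
without_bonus = position_score(hits, early_bonus=1.0)
```

With the bonus on, the hit in the boilerplate header counts double (total
3.0).  With the bonus overridden, both hits count the same (total 2.0), which
is the behavior a site with identical page openings would want.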

Obviously, you are not writing Swish-e for use on this one website!  But
perhaps thinking about how it is used on this site can offer an example to keep
in mind.

Thanks for this interesting discussion topic.

