RE: questions about searching with swish-e

From: David Norris <kg9ae(at)>
Date: Thu Aug 26 1999 - 15:10:55 GMT

>	1) when we give more than one index file to swish-e, the search
> Why aren't all these results merged?

I haven't a clue why it was originally done like this.  I smell a
performance issue somewhere.  Regardless, I consider it a feature rather
than a bug.  You can easily create a separate merged index out of the
smaller indices.  In many situations, keeping the results separate saves
you from having to run swish-e several times to query different indices,
for instance in a hierarchical search where relevance matters within each
individual index (similar to Yahoo).  Forking another process takes the
most time of anything.  Your suggestion could benefit index creation and
reduce storage space, at some (potentially quite noticeable) cost to
runtime performance.  I'd rather spend a few minutes creating another
index than spend 1 second longer performing a query.

>	2) the -m option allows to specify the maximum number of results
> But, as -m option seems to take the first X files in the result list,
> when we search in more than one file, files with high relevance rank
> can be ignored if they don't appear in the first index file.
> Shouldn't this be modified?

I agree, this is broken.  It has caused me a bit of trouble as well.  I
didn't originally use it in my script, but I was a little speed crazy.
(Hey, I gained a millisecond or two at runtime...)  I think it should
perhaps return either the first X results for each index or split the
results evenly between the indices.  The former is probably less of a
headache and more intuitive.
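
To make the two alternatives concrete, here is a rough Python sketch of
both policies.  It is only an illustration, not swish-e code: the function
names, the dictionary layout, and the sample ranks and file names are all
made up, and each index's hits are assumed to arrive already sorted by
descending rank.

```python
def first_x_per_index(results_by_index, x):
    """Policy 1: keep at most x hits from every index."""
    return {name: hits[:x] for name, hits in results_by_index.items()}


def split_evenly(results_by_index, x):
    """Policy 2: share a total budget of x hits across the indices."""
    names = list(results_by_index)
    if not names:
        return {}
    share, extra = divmod(x, len(names))
    limited = {}
    for position, name in enumerate(names):
        # The first `extra` indices each get one additional slot.
        quota = share + (1 if position < extra else 0)
        limited[name] = results_by_index[name][:quota]
    return limited


# Hypothetical sample data: (rank, path) pairs, best hit first.
results = {
    "docs.index": [(1000, "a.html"), (400, "b.html"), (100, "c.html")],
    "news.index": [(900, "x.html"), (850, "y.html")],
}

per_index = first_x_per_index(results, 2)
even = split_evenly(results, 3)
```

The first policy needs no bookkeeping at all, which is why it looks like
the smaller headache.  The even split has to decide who gets the leftover
slots, and this sketch does not hand unused slots from a short index to
the others.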

I've decided to stop using it, since the search script can handle the
limit better and nearly as fast.  The addition of that option was debated
quite a bit.  It does simplify the search script slightly in some cases.
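
For what it's worth, the script-side handling can be as small as the
following Python sketch.  It is not my actual script and it does not parse
real swish-e output: the (rank, path) tuples, the index names, and the
top_hits function are invented for illustration, and it assumes ranks from
different indices can be compared directly, which may not hold.

```python
import heapq


def top_hits(results_by_index, m):
    """Return the m best hits across all indices, highest rank first."""
    all_hits = (
        (rank, path, name)
        for name, hits in results_by_index.items()
        for rank, path in hits
    )
    # nlargest scans every hit once and keeps only the best m.
    return heapq.nlargest(m, all_hits, key=lambda hit: hit[0])


# Hypothetical sample data: (rank, path) pairs per index.
sample = {
    "docs.index": [(1000, "a.html"), (400, "b.html"), (100, "c.html")],
    "news.index": [(900, "x.html"), (850, "y.html")],
}

best = top_hits(sample, 3)
```

Because the limit is applied after the hits from every index have been
collected, a high-ranking file in the second index is no longer dropped
just because the first index filled the quota.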

,David Norris

ICQ Universal Internet Number - 412039
Received on Thu Aug 26 08:01:58 1999