> Your suggestion could benefit index
> creation and lessen storage space at some (potentially quite noticeable)
> cost to runtime performance. I'd rather spend a few minutes creating
> another index than spending 1 second longer performing a query.
I am working on newsgroup indexes. My indexes are already merged indexes, and they are quite big (one index per week, with all the weekly indexes merged into a yearly index).
When I want to search more than one newsgroup index or more than one yearly index, I have to use the -f option with the list of required indexes.
As there are too many possible combinations, I can't run my search on a single index merged from the X selected newsgroups.
So if all the results were merged, it would be very useful for me and for many other swish-e users who want to search several indexes that can't be merged.
> I agree, this is broken. This has caused me a bit of trouble, as well. I
> didn't originally use it in my script, but, I was a little speed crazy.
> (Hey, I gained a millisecond or two at runtime...) I think, perhaps, it
> should return either the first X results for each index or evenly split the
> results between the indices. The former is probably less of a headache and
> more intuitive.
I agree that the current swish-e should return the first X results of each index.
Or, with merged results, the first X results would be those with the highest relevance rank.
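To make the merged-results idea concrete, here is a rough sketch in Python (not swish-e code). It assumes each index returns its hits already sorted by rank, highest first, as (rank, path) pairs; the function name, the sample ranks, and the paths are all made up for illustration.

```python
import heapq
from itertools import islice

def merge_top_results(per_index_results, limit):
    """Merge per-index hit lists and keep the `limit` highest-ranked hits.

    Assumption: every inner list holds (rank, path) tuples that are
    already sorted by rank in descending order.
    """
    merged = heapq.merge(*per_index_results,
                         key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, limit))

# Hypothetical hits from two separate indexes.
index_a_hits = [(1000, "comp.lang.c/101"), (640, "comp.lang.c/87")]
index_b_hits = [(910, "comp.lang.perl/12"), (300, "comp.lang.perl/45")]

top_hits = merge_top_results([index_a_hits, index_b_hits], 3)
```

Because the merge is lazy, only as many hits as requested are pulled from each list, so asking for the first X results should stay cheap even with many indexes.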
IMHO, another useful feature would be the ability for swish-e to update an index. Adding files can already be done by creating an index for the new files and merging it with the existing index.
But removing files from an index would be a great thing and perhaps not too hard for the swish-e maintainers to implement. I imagine it could be done by removing all references to the file from the index, and then removing all the words that are no longer referenced by any file.
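As a sketch of the removal I have in mind, here is a toy version in Python. It assumes the index can be modelled as a mapping from each word to the set of files containing it; this is only an illustration and says nothing about the real swish-e index file layout. The name remove_file and the sample data are hypothetical.

```python
def remove_file(index, file_id):
    """Drop every reference to `file_id`, then drop orphaned words.

    Assumption: `index` maps word -> set of file ids (a toy inverted
    index, not the swish-e on-disk format). Modified in place.
    """
    for word in list(index):           # copy the keys: we delete while looping
        postings = index[word]
        postings.discard(file_id)      # step 1: remove the file reference
        if not postings:               # step 2: word no longer used anywhere
            del index[word]
    return index

# Hypothetical index over two articles, numbered 1 and 2.
toy_index = {"swish": {1, 2}, "newsgroup": {2}, "merge": {1}}
remove_file(toy_index, 2)
```

After the call, "newsgroup" disappears entirely because article 2 was its only reference, while "swish" and "merge" still point to article 1.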
Last question: how does swish-e know whether a word is in the file title?
In the index file, is there a special bit set in front of the word after the file reference, or does it check afterwards whether the word is in the file title?
Received on Fri Aug 27 01:44:47 1999