Bill Moseley wrote:
> I'm trying to understand the difference with the old swish.
> The old swish used to take a search like "search for run*" and then lookup
> all the words in the index that started with run* and convert it int a
> search of
> "search for (run or runs or running)" and then use that for the query. (I
> don't remember how it then handled that query, though.)
> Does 2.0 also first expand "search for run*" into "search for (run or runs
> or running)"?
No, 2.0 keeps the search as "search for run*". No expand is made.
Let's compare both versions ...
Old swish-e works as follows
get all words (expandstar) ---> this gives (run or runs or running)
for each word in "run runs running"
result =getfileinfo(word) ---> This gives one result list per word
for all result ----> "or" all results
As you can see, file index is read several times: Sequentially in expandstar to
get all the words and sequentially, for each word, once again, in getfileinfo
until it finds each word's info!!!
So, if you just have three words starting with r* ...
To read "run" info, old swish-e just reads "run"
To read "running" info, old swish-e reads: "run" and "running" (and discards
To read "runs" info, old swish-e reads: "run", "running" and "runs" (and
"run" and "running").
You can see how terrible is this for performance if you search for "r*".
swish-e 2.0 just does the following:
So, no "orresults" is made. But in the other hand, the logic of getfileinfo is
more complex: It checks for '*'. If it is found, it gets all the words info at
adding it to a unique result list!!
If there is not an '*' in the word, swish-e-2.0 uses a hash search to find the
hash(word) ----> entry in index
In a normal word search (no '*'), old swish-e uses a sequential approach
after locating the start of the word using the first letter of the word. To find
To get "running" info, old swish-e reads: "ra" .... "run" and "running" (and
"ra" ... "run"). So in the worst scenary, if you have 400 words starting with
and the word searched is the last one, you have to read all the remaining 399
to get the data of just one searched word!!!
Well, it is a little bit complex but much faster!!
BTW, I will apply the fix and will make available soon the new beta.
Received on Wed Jun 28 11:24:50 2000