performance aspects

From: swishe <swishe(at)>
Date: Thu May 27 2004 - 07:31:06 GMT

Hi swish-e developers,

we are using swish-e 2.4.2 on a Linux server.
We have built a large index containing about 1 million XML records.
Normally, search results are returned in well under a second.

But we have noticed the following behaviour:
When doing a boolean search like "key1=value1 AND key2=value2", it seems that
swish-e performs the key2=value2 search even if the first part of
the query (key1=value1) does not find any results.
Would it be possible to fix this?
If so, we could place the "critical" (expensive) search terms, such as phrase
searches or right-truncated strings, on the right side, after the uncritical
search terms. Most programming languages behave like this when evaluating
conditions (short-circuit evaluation). In our application, requests like
    key1=<string> AND key2=<1 or 2 chars>*
are possible.
A "full table scan" looking for words beginning with 1 or 2 specific chars
takes a long time. Our word index contains about 30 million items.
It would be ideal if, in this example, the search for the second part
could be restricted to the records found by the first condition "key1=<string>".
In other words: it would be ideal if swish-e could JOIN results from
left to right.
Or, put yet another way: do you use a query optimizer and, if so, how
does it work?
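
To make the request concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not swish-e code and says nothing about how swish-e is implemented; the toy data and all names (INDEX, RECORD_WORDS, lookup_exact, lookup_prefix, and_query) are hypothetical and exist only for illustration. The idea: evaluate the cheap left term first, return immediately if it finds nothing, and otherwise test the prefix only against the records the left term found.

```python
from typing import Dict, List, Set

# Hypothetical toy inverted index: word -> ids of the records containing it.
INDEX: Dict[str, Set[int]] = {
    "karlsruhe": {1, 2},
    "kant": {2, 3},
    "library": {1},
}

# Hypothetical forward map: record id -> words in that record.
RECORD_WORDS: Dict[int, List[str]] = {
    1: ["karlsruhe", "library"],
    2: ["karlsruhe", "kant"],
    3: ["kant"],
}


def lookup_exact(index: Dict[str, Set[int]], word: str) -> Set[int]:
    """Cheap: a single dictionary lookup."""
    return index.get(word, set())


def lookup_prefix(index: Dict[str, Set[int]], prefix: str) -> Set[int]:
    """Expensive: scans every word in the index (the 'full table scan')."""
    hits: Set[int] = set()
    for word, ids in index.items():
        if word.startswith(prefix):
            hits |= ids
    return hits


def and_query(
    index: Dict[str, Set[int]],
    record_words: Dict[int, List[str]],
    exact_word: str,
    prefix: str,
) -> Set[int]:
    """Evaluate 'exact_word AND prefix*' from left to right."""
    left = lookup_exact(index, exact_word)
    if not left:
        # Short-circuit: the AND cannot match, so skip the prefix search.
        return set()
    # Restrict the prefix test to the records found by the left term.
    return {
        rid
        for rid in left
        if any(word.startswith(prefix) for word in record_words[rid])
    }
```

For example, and_query(INDEX, RECORD_WORDS, "missing", "ka") returns an empty set without ever touching the word list, whereas the naive form lookup_exact(...) & lookup_prefix(...) would scan all words first.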

My second question concerns another performance aspect:
We are using properties, so swish-e generates two big files,
index.swish-e and index.swish-e.prop.
In our application, index.swish-e.prop is twice as big as index.swish-e.
Would it speed up searching if we copied one or both of these files
to a RAM disk?

Thanks a lot in advance.

Best regards, Uwe Dierolf

Uwe Dierolf                       Tel  0721/608-6076
University Library of Karlsruhe   Fax  0721/608-4886
Postfach 6920                     76049 Karlsruhe / Germany
Received on Thu May 27 00:31:08 2004