Re: Search on metanames - internals and speed

From: Bill Moseley <moseley(at)>
Date: Thu Mar 31 2005 - 18:35:27 GMT
On Thu, Mar 31, 2005 at 10:20:20AM -0800, Brett Paden wrote:
> Does anyone know how metaname searches are done on a swish index?


> I have a largish swish index (around 1 gig) that performs quite well 
> except when doing searches that contain multiple metanames.  For 
> example:
> -w '(america OR clinton) AND (owner_id=xxx OR owner_id=yyyy OR owner_id=zzz OR ...)'

Every search is a metaname search, so it has nothing to do with that.

> But with anywhere from 10 to 100 metaname=key strung together with ORs.

My guess is that a search with 100 ORed terms would take somewhere near
100 times longer than a search with a single term.

Each individual word is a query to the index that builds up a list of
results.  Then that result is either ORed or ANDed with the existing
list to make a new list.  When that is all done, the entire list is sorted
and the results are returned.  Sounds rather linear, doesn't it?  Now, that's
not taking into consideration what the OS might be doing to buffer the
disk reads.
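
To make the steps above concrete, here is a rough sketch in Python.  It is a toy model, not swish-e's actual code: the dictionary index, the function names and the rule of summing ranks on a merge are all assumptions made for illustration.  "swishdefault" is the metaname swish-e uses when a query names none.  The thing to notice is that the number of index lookups equals the number of words in the query.

```python
# Toy model of the evaluation described above; NOT swish-e's real code.
# The index maps (metaname, word) -> {doc_id: rank}. All names are made up.

def lookup(index, metaname, word):
    """One index query per word: returns a result list {doc_id: rank}."""
    return dict(index.get((metaname, word), {}))

def or_merge(a, b):
    """OR: union of two result lists (summing ranks is an assumption)."""
    merged = dict(a)
    for doc, rank in b.items():
        merged[doc] = merged.get(doc, 0) + rank
    return merged

def and_merge(a, b):
    """AND: keep only documents present in both result lists."""
    return {doc: a[doc] + b[doc] for doc in a if doc in b}

def search(index, text_words, owner_ids):
    """Evaluate (w1 OR w2 ...) AND (owner_id=a OR owner_id=b ...)."""
    lookups = 0
    words = {}
    for w in text_words:
        words = or_merge(words, lookup(index, "swishdefault", w))
        lookups += 1
    owners = {}
    for o in owner_ids:
        owners = or_merge(owners, lookup(index, "owner_id", o))
        lookups += 1
    hits = and_merge(words, owners)
    # The sort happens once, after every word has been merged in.
    ranked = sorted(hits, key=lambda d: (-hits[d], d))
    return ranked, lookups

INDEX = {
    ("swishdefault", "america"): {1: 10, 2: 5},
    ("swishdefault", "clinton"): {2: 7, 3: 4},
    ("owner_id", "xxx"): {1: 1},
    ("owner_id", "yyyy"): {2: 1},
    ("owner_id", "zzz"): {4: 1},
}

if __name__ == "__main__":
    print(search(INDEX, ["america", "clinton"], ["xxx", "yyyy", "zzz"]))
```

With 100 owner_id terms the second loop runs 100 times, which is where the roughly linear growth comes from.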

> Also, I've noticed that repeating the query speeds results 
> dramatically.

Are you running the swish-e binary or using the C/Perl interface?

> I assume that swish stores some portion of the index in 
> memory as slightly modifying the query slows result time.

Thank your OS for that.

> Is there a way to force swish to store the entire index in memory
> before any queries are done?

Like a RAM disk?  I think I'd trust the OS to buffer the best.
Using the C or Perl interface (SWISH::API) will help keep buffers in
memory since the index remains open between requests.  Also saves the
overhead of forking (minor) and opening and parsing the header each time.
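
The difference between reopening the index for every search and holding it open can be shown with a small toy model.  This is only an illustration: "ToyIndex" is a hypothetical class, not the SWISH::API or C interface, and the counter merely stands in for the cost of opening the index and parsing its header.

```python
# Toy model only: "ToyIndex" is hypothetical and is NOT the SWISH::API
# interface. The counter stands in for the open-and-parse-header cost.

class ToyIndex:
    header_parses = 0  # how many times the pretend header was parsed

    def __init__(self, docs):
        ToyIndex.header_parses += 1  # paid once per open
        self.docs = docs

    def query(self, word):
        """Return the sorted ids of documents containing the word."""
        return sorted(d for d, words in self.docs.items() if word in words)

DOCS = {1: {"america"}, 2: {"america", "clinton"}, 3: {"clinton"}}

def per_request(words):
    """Like running the binary once per search: reopen every time."""
    return [ToyIndex(DOCS).query(w) for w in words]

def persistent(words):
    """Like a long-running process that holds one open handle."""
    idx = ToyIndex(DOCS)
    return [idx.query(w) for w in words]
```

Both functions return the same results; only the number of opens differs, one per search in the first case and one in total in the second.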

Bill Moseley

Received on Thu Mar 31 10:35:28 2005