Re: swish-e on a large scale

From: Peter Karman <karman(at)not-real.cray.com>
Date: Thu Sep 30 2004 - 18:40:16 GMT

Aaron Levitt wrote on 09/30/2004 01:17 PM:

> How do I use the new btree database back-end?  Would this be better 
> than the default?
> 

> On Sep 30, 2004, at 9:25 AM, Bill Moseley wrote:
>>Since you are indexing a mail archive (where old messages don't
>>change) then you should try building swish with the
>>--enable-incremental option.  And then you can *add* files to the
>>index as needed.  It still requires some of the normal processing (like
>>presorting all the records) but should be faster than reindexing.
> 

You can tell us whether it's better. It does allow for incremental 
indexing (as the name suggests) and has been tested by a few folks, but 
I'm sure Jose would love to see it tested on the scale you're attempting.

As far as using mysql, just be clear that you can't use mysql as a 
backend for swish-e per se. You could spider docs and add their content 
to a mysql db, then output that data to swish-e for faster searching than 
mysql provides. But swish-e can only use its own index (either the 
traditional format or the btree format you get by building with 
--enable-incremental) to store data.

spider -> swishindex
or
spider -> mysql -> swishindex

Seems obvious to me that unless you want to do something with the data 
besides search it, the mysql step is superfluous.
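
If you do go the "spider -> mysql -> swishindex" route, the last hop is just a program that dumps rows in a form swish-e can read. Here is a rough sketch of that step, not a tested setup. It assumes the header-plus-body document format that swish-e reads from an external program (Path-Name and Content-Length headers, a blank line, then the content). The docs table, its columns, and the function names are made up for illustration, and sqlite3 stands in for mysql only so the example runs on its own.

```python
import sqlite3


def format_for_swish(path, content, mtime=None):
    """Return one document as bytes: headers, a blank line, then the body."""
    body = content.encode("utf-8")
    # Content-Length counts bytes, not characters, so measure the encoded body.
    headers = [f"Path-Name: {path}", f"Content-Length: {len(body)}"]
    if mtime is not None:
        headers.append(f"Last-Mtime: {int(mtime)}")
    return ("\n".join(headers) + "\n\n").encode("utf-8") + body


def dump_documents(conn):
    """Yield every row of the (hypothetical) docs table as a formatted document."""
    query = "SELECT path, content, mtime FROM docs ORDER BY path"
    for path, content, mtime in conn.execute(query):
        yield format_for_swish(path, content, mtime)


def demo():
    """Build a one-row in-memory database and return the formatted output."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (path TEXT, content TEXT, mtime INTEGER)")
    conn.execute(
        "INSERT INTO docs VALUES (?, ?, ?)",
        ("msg/0001.txt", "hello swish-e", 1096569616),
    )
    try:
        return b"".join(dump_documents(conn))
    finally:
        conn.close()
```

A real version would write the bytes from dump_documents() to standard output (sys.stdout.buffer) and let swish-e run the script as its document source. Check the swish-e docs for the exact invocation and header names before relying on this.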

Just wanted to make sure the mysql relationship is clear.

-- 
Peter Karman - 651-605-9009 - karman@cray.com
Received on Thu Sep 30 11:40:28 2004