Swish-e poor performance?

From: Craig A Summerhill <craig(at)not-real.cni.org>
Date: Fri Aug 28 1998 - 05:27:52 GMT
On Thu, 27 Aug 1998, Lars Kellogg-Stedman <lars@bu.edu> wrote:
> 
> I have a fairly large collection of email (over 100MB in 20,000 messages),
> and finding a piece of information -- esp. one several months old -- can
> be challenging.  In the past I've used glimpse to index the mail, but I
> wanted to give Swish-e a try (easier search syntax, and it looked as
> though it was under more active development).
> 
> I had to give up, because after 18 hours it was still churning away.  On
> this same collection, glimpse took approx. 3 hours.  Is swish-e's
> performance *really* this poor?  Are there any faster alternatives out
> there?

Lars, 

I'm not sure what the problem is, but there is definitely a problem in 
your setup.  It sounds to me like it is looping infinitely somehow...

I archive one mailing list that roughly approximates what you are 
doing.  It has 19,044 messages dating back to 1992, constituting 115MB 
of data (prior to indexing).  On an Alpha 1000/300 running Digital Unix, 
it takes less than two minutes to index.
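
To put the two reports side by side, here is a rough back-of-the-envelope
comparison in Python.  It uses only the figures quoted in this thread, which
are approximate ("over 100MB", "less than two minutes"), and the machines
involved differ, so treat it as an order-of-magnitude sketch rather than a
benchmark.  The function and variable names are made up for illustration.

```python
# Rough throughput comparison using the figures quoted in this thread.
# Sizes and times are approximate, so the results are estimates only.

def throughput_mb_per_s(size_mb: float, seconds: float) -> float:
    """Return average indexing throughput in megabytes per second."""
    return size_mb / seconds

# Craig's list archive: 115 MB indexed in (at most) two minutes.
craig = throughput_mb_per_s(115, 2 * 60)

# Lars's glimpse run: roughly 100 MB in roughly 3 hours.
glimpse = throughput_mb_per_s(100, 3 * 3600)

# Lars's Swish-e run: roughly 100 MB, still unfinished after 18 hours,
# so the real rate is no better than this figure.
swish_stalled = throughput_mb_per_s(100, 18 * 3600)

print(f"Swish-e (Craig):        {craig:.3f} MB/s")
print(f"glimpse (Lars):         {glimpse:.4f} MB/s")
print(f"Swish-e (Lars, stalled): {swish_stalled:.4f} MB/s")
print(f"gap between Swish-e runs: about {craig / swish_stalled:.0f}x")
```

A gap of several hundred times between two Swish-e runs on similar data is
hard to explain by hardware alone, which is why a setup problem seems the
more likely cause.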

Our entire Web site is a couple of gigabytes of data.  I index it in 
logical pieces (about 20 separate indexes), but I regenerate each index
on a daily basis using a cron script in the wee hours of the morning.
Generating all 20 or so indexes takes less than 30 minutes as near as 
I can tell from the time stamps on the e-mail confirmations.
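
For anyone wanting to set up something similar, below is a minimal sketch of
a nightly rebuild driver in Python.  It is not the actual cron script
described above, whose contents are not given here.  It assumes one Swish-e
configuration file per index, all kept in a single directory with a .conf
extension (a hypothetical layout), and it relies only on the `swish-e -c
FILE` form of the command, which indexes according to the named
configuration file.

```python
import subprocess
import time
from pathlib import Path


def config_files(config_dir):
    """Return the *.conf files in config_dir, sorted by name."""
    return sorted(Path(config_dir).glob("*.conf"))


def rebuild_all(config_dir, run=subprocess.run):
    """Index once per config file; return a list of (name, seconds) pairs.

    `run` is injectable so the loop can be exercised without swish-e
    installed; by default it is subprocess.run.
    """
    timings = []
    for conf in config_files(config_dir):
        start = time.monotonic()
        # check=True makes a failed indexing run raise instead of passing silently.
        run(["swish-e", "-c", str(conf)], check=True)
        timings.append((conf.name, time.monotonic() - start))
    return timings
```

Calling rebuild_all() on the configuration directory from a cron job and
printing the returned timings would give a per-index record of how long each
rebuild took, which makes a runaway indexing job easy to spot.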
-- 

   Craig A. Summerhill, Systems Coordinator and Program Officer
   Coalition for Networked Information
   21 Dupont Circle, N.W., Washington, D.C.   20036
   Internet: craig@cni.org   AT&Tnet (202) 296-5098
Received on Thu Aug 27 22:39:56 1998