[swish-e] incremental index format produces *much* larger index files?

From: Judith Retief <JudithR(at)not-real.inet.co.za>
Date: Tue Nov 20 2007 - 09:54:19 GMT
I'm trying to get incremental indexing working. I will be trying out the 2.6
BDB version soon, but would like to see if I can get the current version
working in the interim.

I've built the swish-e binary using --incremental-index, and used it to
index 40 000 articles (< 64K each) in two runs: first using incremental
indexing and then using standard merge indexing, i.e. indexing 50 items at a
time and merging them into a master index.

Both worked very well; no crashes, no instability or anything, and they
produce the same search results. 

However, my index files differ hugely in size: the merged index files add up
to about 80M, the incremental index files almost 600M! What's going on?
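
To put a number on the difference, the on-disk totals of the two sets of index files can be summed and compared. The following is only a minimal sketch: the helper names `total_size_bytes` and `size_ratio` and the file names in the usage comment are hypothetical and not part of swish-e.

```python
import os


def total_size_bytes(paths):
    """Return the combined on-disk size, in bytes, of the given files."""
    return sum(os.path.getsize(p) for p in paths)


def size_ratio(larger_paths, smaller_paths):
    """Return how many times larger the first set of files is than the second."""
    return total_size_bytes(larger_paths) / total_size_bytes(smaller_paths)


# Hypothetical usage; substitute the real index file names:
#   incremental = ["incremental.index", "incremental.index.prop"]
#   merged = ["merged.index", "merged.index.prop"]
#   print(f"incremental is {size_ratio(incremental, merged):.1f}x larger")
```

With the figures quoted above (almost 600M against about 80M), the ratio works out to roughly 7.5.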

Is there anything that I could be doing wrong to be generating these huge
files? 

I know not many people are using incremental indexing, and now that a
totally new implementation is almost done there isn't much motivation to
investigate the abandoned incremental index implementation. But for our
purposes incremental indexing would be very much preferable to merge
indexing, so I'd be very happy if I could get it to work and eventually
replace it with version 2.6.


Received on Tue Nov 20 04:54:32 2007