Hello,
We use Swish-e for our service, Bloglines, and have been very happy with
it. Currently, it's indexing about a gigabyte of data, consisting of
about 2 million HTML "pages," but that's rapidly expanding. Right now, a
full index is taking around 3 hours on our hardware. It's not memory
constrained. What can I do going forward to deal with the increasing
data? An immediate desire is to take advantage of more than one
processor, so I was thinking of just splitting the data over multiple
index files. Is there any advantage to merging indexes into one big file
instead of just using several smaller files? Is there a performance hit
on searching multiple files? Anything else I should be considering?
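
To make the splitting idea concrete, here is a rough sketch, not a tested
setup. It assumes the data is already laid out in several top-level
directories, and the config name (swish.conf), index names (index.0,
index.1, ...) and directory names are all hypothetical. It deals the
directories out round-robin and builds one swish-e indexing command per
shard, using the -c (config), -i (input) and -f (index file) options, so
that each shard can be indexed by its own process.

```python
import subprocess


def shard_round_robin(paths, n_shards):
    """Deal paths out over n_shards lists, one at a time, in order."""
    if n_shards < 1:
        raise ValueError("n_shards must be >= 1")
    shards = [[] for _ in range(n_shards)]
    for i, path in enumerate(paths):
        shards[i % n_shards].append(path)
    return shards


def index_commands(shards, config="swish.conf", prefix="index"):
    """Build one swish-e command line per non-empty shard.

    The config file name and the index file prefix are placeholders.
    """
    commands = []
    for k, shard in enumerate(shards):
        if not shard:
            continue
        commands.append(
            ["swish-e", "-c", config, "-i", *shard, "-f", f"{prefix}.{k}"]
        )
    return commands


def run_parallel(commands):
    """Start every command at once, then wait; return the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Hypothetical directory names; two shards for a two-processor box.
    dirs = ["data/a", "data/b", "data/c", "data/d"]
    for cmd in index_commands(shard_round_robin(dirs, 2)):
        print(" ".join(cmd))
```

The resulting files could then either be searched together by listing
them all after -f, or combined with -M; which of those is the better
choice is what I'm asking about.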
Thanks,
Mark
--
Mark Fletcher
Bloglines
http://www.bloglines.com
Received on Mon Sep 8 04:03:55 2003