Re: A few newbie-questions ...

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Jul 20 2004 - 16:01:39 GMT
On Tue, Jul 20, 2004 at 03:10:40AM -0700, Volker wrote:
> But what I still do not understand: each day I want to add
> MySQL-generated output to an existing index that has meanwhile grown to
> 700MB.
> The script that retrieves information from the MySQL database and
> "feeds" swish-e works fine.
> 
> But HOW can I add 10 new pages (my script feeds swish-e with HTML pages
> stored in a MySQL database, as mentioned above) to an existing index file?

Basically, the normal response to that question is: swish-e doesn't
support incremental indexing.

Here are your options:

1) recreate your index when needed.  This stops being practical when
you are pushing the limits of swish by indexing a huge number of docs.

2) work out a system where you only reindex once a (day|week|month)
and then create a temporary index containing just docs added since
your last big indexing.  Then search:

   swish-e -w $query -f $main_index $files_since_main_was_indexed

3) do number 2 but merge the indexes.  That will be faster when
searching and sorting by properties other than rank.

4) do some variation of #2 or #3 and use more indexes -- useful for
indexing lots of files that change often

5) try the incremental indexing option in swish-e.  Run ./configure
--help to see how to build swish-e to support incremental indexing.
It won't be compatible with other versions of swish or other indexes,
and you can only *add* new files, not update or remove existing ones.

6) consider if what you are indexing is so big that maybe swish-e
isn't the right tool.
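
A rough sketch of how options 2 and 3 might be wired together in a
shell script.  The file names main.index and recent.index and the
function names are made up for illustration; -M is swish-e's merge
switch, and -S prog -i stdin tells swish-e to read documents from
standard input.  As written it is a dry run that only prints the
commands, so nothing touches a real index until SWISH is changed.

```shell
#!/bin/sh
# Sketch only: file names and function names are hypothetical.
# SWISH is a dry-run stand-in that prints each command; set it to
# "swish-e" to actually run them.
SWISH="echo swish-e"
MAIN="main.index"       # the big index, rebuilt once a day/week/month
RECENT="recent.index"   # small index holding docs added since then

# Build the small index from documents fed on standard input
# (e.g. the output of the script that pulls pages from MySQL).
index_recent() {
    $SWISH -S prog -i stdin -f "$RECENT"
}

# Option 2: search the main index and the recent index together.
search_both() {
    $SWISH -w "$1" -f "$MAIN" "$RECENT"
}

# Option 3: merge both indexes into a new file.  The last argument
# to -M is the output index; swapping it in for $MAIN is left out here.
merge_recent() {
    $SWISH -M "$MAIN" "$RECENT" "$MAIN.new"
}
```

For example, "search_both foo" prints the same kind of command line
shown under option 2.  Merging to a separate output file rather than
over the main index is just a cautious choice, so searches can keep
using the old index until the merge has finished.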

Is this a FAQ?

-- 
Bill Moseley
moseley@hank.org
Received on Tue Jul 20 09:01:55 2004