Re: Merging indexes

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Mon Jul 01 2002 - 16:33:10 GMT
At 09:20 AM 07/01/02 -0700, CBol wrote:
>I fixed up an index to an internal site, and since I use it over the net, I
>indexed using -S http. It worked fine, but it took 17 hours to build the
>index. The problem is that, once a month, I have to include a few more
>pages in the index, and I tried to do it using the merge option, as in:
>
>swish-e -M newpages.index swish.index

The merge syntax puts the output index last, and that file must not
already exist:

    -M inputfile_1 inputfile_2 inputfile_3 [...] output_index_file
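
As a rough sketch of that, here is a small shell wrapper that checks the
output name before calling merge. The function name merge_indexes, the
SWISH_E variable, and the file name merged.index are made up for
illustration; only the -M switch itself comes from swish-e.

```shell
# Sketch only: merge existing indexes into a NEW output file.
# SWISH_E and all file names are placeholders, not fixed names.
SWISH_E=${SWISH_E:-swish-e}

merge_indexes() {
    # Need at least two input indexes plus one output index.
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_indexes input1 input2 [...] output" >&2
        return 2
    fi
    # Walk the arguments so that $out ends up holding the last one.
    for out in "$@"; do :; done
    # swish-e refuses to overwrite an existing output file, so check first.
    if [ -e "$out" ]; then
        echo "output index '$out' already exists; pick a new name" >&2
        return 1
    fi
    "$SWISH_E" -M "$@"
}

# Example: merge_indexes swish.index newpages.index merged.index
```

If the merge succeeds, you would then rename merged.index over the old
index yourself.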


>all I got is an error message:
>    err: Merge output file 'swish.index' already exists.  Won't overwrite.
>
>I also tried to merge both indexes to a new file, and it replied to me:
>    err: Header WordCharacters in index swish.index doesn't match output header

That should mean that the two indexes you are merging were not created with
the same config file settings (WordCharacters, in this case).
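
One way to keep the headers in step is to build the small index from the
same config file as the big one. This is a sketch, not a tested recipe:
the function name index_new_pages, the SWISH_E variable, swish.conf, and
the URL are placeholders, and it assumes that -i on the command line takes
precedence over any IndexDir in the config file.

```shell
# Sketch only: index a few new pages with the SAME config file as the
# main index, so header settings such as WordCharacters match.
# SWISH_E, the config name, the URL, and the index name are placeholders.
SWISH_E=${SWISH_E:-swish-e}

index_new_pages() {
    conf=$1     # config file used for the main index
    url=$2      # starting URL for the new pages
    index=$3    # name of the small index to create
    "$SWISH_E" -c "$conf" -S http -i "$url" -f "$index"
}

# Example:
#   index_new_pages swish.conf http://intranet.example/new/ newpages.index
```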

>Is there a simple solution, or can someone give me a hint of the best
>direction?

For now, how about searching all of the indexes at once:

    ./swish-e -w $query -f index1 index2 index3 ...

Merge does not work well in the 2.1-dev version, and the developers are
currently working on it.  It uses way too much memory.

Hopefully it will improve in the next month.  What you really need is
incremental indexing.

17 hours is a long time for indexing.  How many files were you indexing?


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Mon Jul 1 16:36:48 2002