(no subject)

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Jun 22 2004 - 17:46:35 GMT
On Tue, Jun 22, 2004 at 08:47:14AM -0700, Bill Pavich wrote:
> Unless you can suggest
> otherwise, I didn't see where I could remove files from an (the master
> in this case) index and then just remerge that one specific day's index
> back into the master, is there?

Well, kind of.

When merging, if two files have the same path name, swish-e compares
their last-modified dates and keeps the newer one.  Note "Replaced file" below:

    moseley@bumby:~$ echo "first one" > test
    moseley@bumby:~$ swish-e -i test -f 1.idx -v0

    moseley@bumby:~$ echo "updated file" > test
    moseley@bumby:~$ swish-e -i test -f 2.idx -v0
    
    moseley@bumby:~$ swish-e -M 1.idx 2.idx out.idx -T indexed_words
    Input index '1.idx' has 1 files and 2 words
    Input index '2.idx' has 1 files and 2 words
    Replaced file 'test 2004-06-22 10:37:21 PDT' with 'test 2004-06-22 10:37:42 PDT'
    Getting words in index '1.idx':      2 words
    Getting words in index '2.idx':      2 words
        Adding:[1:swishdefault(1)]   'updated'   Pos:5  Stuct:0x9 ( BODY FILE )
        Adding:[1:swishdefault(1)]   'file'   Pos:6  Stuct:0x9 ( BODY FILE )
    Processing words in index 'out.idx':      4 words
    Removed      2 words no longer present in docs for index 'out.idx'
    Writing main index...
    Sorting words ...
    Sorting 2 words alphabetically
    Writing header ...
    Writing index entries ...
      Writing word text: Complete
      Writing word hash: Complete
      Writing word data: Complete
    2 unique words indexed.
    4 properties sorted.
    1 file indexed.  0 total bytes.  2 total words.
    Elapsed time: 00:00:00 CPU time: 00:00:00
    Indexing done!
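
The replacement rule seen in the transcript above (same path, newer date wins) can be sketched in a few lines of Python. This is only an illustration of the rule, not swish-e's actual code: the dict-based "index", the `merge_indexes` name, the timestamps, and the handling of equal dates are all assumptions made up for the example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for an index: path -> (last-modified time, words).
# A real swish-e index is a binary file; this only models the merge rule.
Index = Dict[str, Tuple[float, Tuple[str, ...]]]


def merge_indexes(indexes: Iterable[Index]) -> Index:
    """Merge indexes, keeping the newest entry when a path appears twice."""
    merged: Index = {}
    for index in indexes:
        for path, (mtime, words) in index.items():
            current = merged.get(path)
            # Assumption: on equal dates the first entry seen is kept.
            if current is None or mtime > current[0]:
                merged[path] = (mtime, words)
    return merged


# Illustrative timestamps only; the second entry is the newer one.
old = {"test": (21.0, ("first", "one"))}
new = {"test": (42.0, ("updated", "file"))}

merged = merge_indexes([old, new])
print(merged["test"][1])
```

Note that every entry of every input index is visited, even though only one file changed.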

How well that works in your case (speed-wise) is another question.
Merging is just indexing without having to parse the input files;
otherwise it goes through all the same steps.  It's not true
incremental indexing, because it's not just removing one file and
adding in another.



-- 
Bill Moseley
moseley@hank.org
Received on Tue Jun 22 17:46:39 2004