On Tue, Jun 22, 2004 at 08:47:14AM -0700, Bill Pavich wrote:
> Unless you can suggest
> otherwise, I didn't see where I could remove files from an (the master
> in this case) index and then just remerge that one specific day's index
> back into the master, is there?
Well, kind of.
When merging, if two files have the same path name, swish-e looks at
the last-modified date and keeps the newer one. Note "Replaced file" below:
moseley@bumby:~$ echo "first one" > test
moseley@bumby:~$ swish-e -i test -f 1.idx -v0
moseley@bumby:~$ echo "updated file" > test
moseley@bumby:~$ swish-e -i test -f 2.idx -v0
moseley@bumby:~$ swish-e -M 1.idx 2.idx out.idx -T indexed_words
Input index '1.idx' has 1 files and 2 words
Input index '2.idx' has 1 files and 2 words
Replaced file 'test 2004-06-22 10:37:21 PDT' with 'test 2004-06-22 10:37:42 PDT'
Getting words in index '1.idx': 2 words
Getting words in index '2.idx': 2 words
Adding:[1:swishdefault(1)] 'updated' Pos:5 Stuct:0x9 ( BODY FILE )
Adding:[1:swishdefault(1)] 'file' Pos:6 Stuct:0x9 ( BODY FILE )
Processing words in index 'out.idx': 4 words
Removed 2 words no longer present in docs for index 'out.idx'
Writing main index...
Sorting words ...
Sorting 2 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: Complete
Writing word hash: Complete
Writing word data: Complete
2 unique words indexed.
4 properties sorted.
1 file indexed. 0 total bytes. 2 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
How well that works in your case (speed-wise) is another question.
Merging is just indexing without having to parse the input files;
otherwise, it goes through all the same steps. It's not true
incremental indexing, because it doesn't simply remove one file and
add another in its place.
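For a daily setup, the same idea might be wrapped up roughly like this.
This is only a sketch: the function name, the directory and index names,
and the SWISH override variable are placeholders of mine, and it only
uses the -i, -f, -v0 and -M switches shown in the session above. It
covers files that were replaced, not files that were deleted.

```shell
#!/bin/sh
# Sketch: re-index one day's files, then merge them with the master index.
# All names here are hypothetical placeholders, not part of swish-e.
SWISH="${SWISH:-swish-e}"   # override to try the script without swish-e

remerge_day() {
    day_dir=$1    # directory holding that day's (updated) files
    master=$2     # existing master index
    out=$3        # merged index to write

    day_idx="${out}.day"

    # Step 1: index only the one day's files into a small index.
    "$SWISH" -i "$day_dir" -f "$day_idx" -v0 || return 1

    # Step 2: merge. Files with the same path name are resolved by
    # last-modified date, so the freshly indexed copies win.
    "$SWISH" -M "$master" "$day_idx" "$out" || return 1
}
```

You would call it as something like
"remerge_day 2004-06-22 master.idx master.new.idx" and then swap the
new index into place once the merge finishes. The merge step still
walks every word in the master, so the speed caveat above applies.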
Received on Tue Jun 22 17:46:39 2004