I use the file system mode rather than spidering. The problem is with
multiple indexes. I index each day like:
For each day, there is a corresponding index, like:
I then search against *.swish-e. Duplication occurs when an
article exists for more than one day, so I use a Berkeley DB file to
keep track of checksums between days.
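
To make the checksum step concrete, here is a rough sketch of the idea, not my actual script. It uses Python's standard dbm module as a stand-in for the Berkeley DB file, and the names md5_of_file, filter_unseen and seen_db_path are made up for illustration.

```python
import dbm
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def filter_unseen(paths, seen_db_path):
    """Return the paths whose content checksum is not yet recorded.

    The checksum database persists between runs, so an article already
    indexed on an earlier day is skipped on later days.
    """
    unseen = []
    # "c" opens the database read/write, creating it if it does not exist.
    with dbm.open(str(seen_db_path), "c") as seen:
        for path in paths:
            checksum = md5_of_file(path).encode("ascii")
            if checksum in seen:
                continue  # same content was indexed on an earlier day
            # Remember which file first carried this content.
            seen[checksum] = str(path).encode("utf-8")
            unseen.append(Path(path))
    return unseen
```

Only the files returned by filter_unseen would then be handed to the indexer for that day's index, so a repeated article ends up in exactly one of the daily indexes.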
> are the URLs you are passing to swish-e unique?
> Patrick O'Lone scribbled on 4/26/06 8:54 AM:
>> I've been using swish-e for some time now. I think it's a great
>> product, but I've had to use a special hack to avoid heavy
>> duplication issues within the index. I use MD5 checksums in an
>> external Berkeley DB file for maintaining uniqueness within a
>> collection of documents - I was wondering if there is a better way.
>> Is it possible to have a unique key in a swish-e index file or would
>> that require the incremental mode feature? Also, will version 2.4.4
>> be coming out soon or is it on hold indefinitely? Thanks for any
Software Project Manager
E-mail ... email@example.com
Phone .... 309-743-0809
Fax ...... 309-743-0830
Received on Thu Apr 27 08:27:31 2006