Re: Unique Indexes

From: Patrick O'Lone <polone(at)not-real.townnews.com>
Date: Thu Apr 27 2006 - 15:27:25 GMT
I use the file system mode rather than spidering. The problem is with 
multiple indexes. I index each day like:

2006/04/01/articles/
2006/04/02/articles/
.
.
etc.

For each day, there is a corresponding index, like:

20060401.swish-e
20060402.swish-e

I then run searches against *.swish-e. Duplication occurs when the same
article appears on more than one day, so I use a Berkeley DB file to
keep track of checksums across days.
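
For anyone following along, here is a rough sketch of the checksum step
in Python. It is not my production code. The function names
(md5_of_file, select_unseen) are made up for illustration, and the
"seen" store is any dict-like object standing in for the Berkeley DB
file, so a plain dict or a handle from Python's dbm module would both
work under that assumption.

```python
import hashlib
from pathlib import Path


def md5_of_file(path):
    """Return the hex MD5 digest of a file's bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large articles are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_unseen(day_dir, seen):
    """Return files under day_dir whose checksum is not yet in `seen`.

    `seen` is a dict-like store (hypothetical stand-in for the Berkeley
    DB file) mapping checksum -> first path that carried that content.
    Newly accepted files are recorded so later days skip them.
    """
    fresh = []
    for path in sorted(Path(day_dir).rglob("*")):
        if not path.is_file():
            continue
        checksum = md5_of_file(path)
        if checksum in seen:
            continue  # identical content was accepted on an earlier day
        seen[checksum] = str(path)
        fresh.append(str(path))
    return fresh
```

Calling select_unseen() once per day directory, in date order and with
the same store each time, yields only the files that should go into
that day's index. An article repeated on a later day is dropped because
its checksum is already recorded.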
>
> are the URLs you are passing to swish-e unique?
>
> Patrick O'Lone scribbled on 4/26/06 8:54 AM:
>> Hello,
>>
>> I've been using swish-e for some time now. I think it's a great
>> product, but I've had to use a special hack to avoid heavy 
>> duplication issues within the index. I use MD5 checksums in an 
>> external Berkeley DB file for maintaining uniqueness within a
>> collection of documents - I was wondering if there is a better way. 
>> Is it possible to have a unique key in a swish-e index file or would 
>> that require the incremental mode feature? Also, will version 2.4.4 
>> be coming out soon or is it on hold indefinitely? Thanks for any 
>> feedback!
>>
>


-- 
Patrick O'Lone
Software Project Manager
TownNews.com

E-mail ... polone@townnews.com
Phone .... 309-743-0809
Fax ...... 309-743-0830
Received on Thu Apr 27 08:27:31 2006