Re: [swish-e] Indexing starts all over again

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Wed Aug 12 2009 - 13:34:52 GMT
Paras Fadte wrote on 08/11/2009 04:38 AM:
> Hi,
> 
> I have had a strange problem while indexing with swish-e: it
> appears to start indexing the data all over again, as if it were
> stuck in a loop. When I try with, say, max_depth=1 or 2, it works
> fine. Can anybody point out what could be happening here?
> 

Sounds like spider.pl (I assume you are using that) is not
identifying URLs as duplicates. You could try turning on the use_md5
option as described in the documentation:
http://swish-e.org/docs/spider.html#duplicate_documents

Search for 'use_md5' in the docs and make sure you have the Digest::MD5
Perl module installed from CPAN.
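
For reference, here is a minimal sketch of a spider config file with
that option turned on. The base_url and email values are placeholders,
and the max_depth value is only an example; please check the key names
against the spider.pl docs for the version you have installed.

```perl
# Hypothetical spider config (e.g. SwishSpiderConfig.pl); all values are
# placeholders. To confirm Digest::MD5 loads, you can run:
#   perl -MDigest::MD5 -e 'print "ok\n"'
@servers = (
    {
        base_url  => 'http://www.example.com/',  # placeholder start URL
        email     => 'you@example.com',          # placeholder contact address
        max_depth => 5,                          # example crawl depth limit
        use_md5   => 1,  # fingerprint each page's content to skip duplicates
    },
);

1;  # the config file must return a true value
```

With use_md5 set, the spider should skip a document whose content
digest it has already seen, even when it is reached through a
different URL.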


-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
gpg key: 37D2 DAA6 3A13 D415 4295  3A69 448F E556 374A 34D9
Received on Wed Aug 12 09:35:22 2009