Re: [swish-e] swish-e looping over the same files again and again (query string?)

From: Jo Rhett <jrhett(at)not-real.netconsonance.com>
Date: Fri Jul 11 2008 - 08:16:32 GMT
Sorry, forgot the obvious:

SWISH-E 2.4.3

IndexFile                       /u/snipped/privdata/live/search/standard.index
IndexReport                     0
IndexDir                                spider.pl
SwishProgParameters             default http://www.snipped.com/
MetaNames               swishdocpath keywords author description
PropertyNames            keywords author description
IndexOnly               .html
NoContents              .zip .gz .Z .sit .cpt .jpg .jpeg .gif .xbm .au .mov .mpg .mp2 .mp3 .dir .drx .ra .rpm .ram .pdf .ps
IgnoreLimit             90 50
IgnoreWords             CVS

On Jul 11, 2008, at 1:11 AM, Jo Rhett wrote:

> So while debugging a different problem I looked at my httpd logs and
> realized something I'd apparently missed before.  The swish-e spider
> is looping over the same files dozens and dozens of times, each time
> with different query arguments.  Because all of the links on the site
> contain a query_string containing the page they came from and a unique
> id for the visitor (and a dynamic toolbar has links to every page),
> this means that each page is indexed N-1 times, where N is the number
> of pages on the site.
>
> Is there an option to tell the swish spider to ignore the query string
> when considering URLs?   I realize that this would be inappropriate
> for many sites, but it is essential for this site, so an option would
> be very useful.
>
> -- 
> Jo Rhett
> Net Consonance : consonant endings by net philanthropy, open source
> and other randomness
>
>
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
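
To make the request above concrete, here is a minimal sketch of the normalization being asked for: drop the query string from each link before deciding whether the page has already been seen. This is not a swish-e or spider.pl option. It is a standalone Python illustration, and the function names (`strip_query`, `unique_pages`) and the query parameters (`from`, `visitor`) are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    # Keep scheme, host and path; blank out query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def unique_pages(urls):
    """Collapse links that differ only by query string, keeping first-seen order."""
    seen = set()
    pages = []
    for url in urls:
        key = strip_query(url)
        if key not in seen:
            seen.add(key)
            pages.append(key)
    return pages


if __name__ == "__main__":
    # Hypothetical links of the kind described: same page, different
    # referring page and visitor id in the query string.
    links = [
        "http://www.snipped.com/a.html?from=b.html&visitor=123",
        "http://www.snipped.com/a.html?from=c.html&visitor=123",
        "http://www.snipped.com/b.html?from=a.html&visitor=123",
    ]
    for page in unique_pages(links):
        print(page)
```

With this kind of normalization each page would be fetched once no matter how many distinct query strings point at it. As noted, it would be wrong for sites where the query string selects the content.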

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source  
and other randomness


Received on Fri Jul 11 04:16:40 2008