On Fri, Jul 11, 2008 at 01:11:56AM -0700, Jo Rhett wrote:
> (query string?)
> So while debugging a different problem I looked at my httpd logs and
> realized something I'd apparently missed before. The swish-e spider
> is looping over the same files dozens and dozens of times, each time
> with different query arguments. Because all of the links on the site
> contain a query_string containing the page they came from and a unique
> id for the visitor (and a dynamic toolbar has links to every page),
> this means that each page is indexed N-1 times, where N is the number
> of pages on the site.
That also makes it hard for browsers to do any caching.
> Is there an option to tell the swish spider to ignore the query string
> when considering URLs? I realize that this would be inappropriate
> for many sites, but it is essential for this site, so an option would
> be very useful.
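
For illustration only, the idea of "ignoring the query string" amounts to reducing every link to a canonical form before deciding whether it has already been seen. The sketch below is generic Python, not swish-e's spider code or configuration; the function names and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def unique_pages(urls):
    """Collapse URLs that differ only by query string, keeping first-seen order."""
    seen = set()
    result = []
    for url in urls:
        key = strip_query(url)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Hypothetical links: same pages, different referrer and visitor arguments.
links = [
    "http://example.com/about.html?from=index&visitor=123",
    "http://example.com/about.html?from=contact&visitor=123",
    "http://example.com/contact.html?from=about&visitor=123",
]

for page in unique_pages(links):
    print(page)
```

With this kind of normalization, the three links above collapse to two pages, so each page would be fetched once rather than once per referring page. As noted in the question, this is only appropriate on sites where the query string does not select different content.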
Quick search of the archives turns up this:
Received on Fri Jul 11 09:43:55 2008