Thanks, Bill.
The actual complaint is that the spider is indexing
pages it shouldn't.
I'll check out the 'skipped' debug flag -- is there
another one that actually shows URLs being compared
against the robots.txt contents?
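
In the meantime, one way to check the matching independently of
spider.pl is a small script built on Python's standard
urllib.robotparser module. This is only a sketch: the robots.txt
text, the agent name, and the URLs below are made-up placeholders
(not from the actual site), and the helper name check_urls is my own.
It also says nothing about how spider.pl itself parses the file, so
a difference between the two results is a hint, not proof.

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt content (assumption); paste the real file's text here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def check_urls(robots_txt, agent, urls):
    """Return {url: True/False}, where True means the agent may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


if __name__ == "__main__":
    # Placeholder URLs (assumption); use the pages that should not be indexed.
    urls = [
        "http://example.com/index.html",
        "http://example.com/private/report.html",
    ]
    for url, allowed in check_urls(ROBOTS_TXT, "my-spider", urls).items():
        print("ALLOWED " if allowed else "DISALLOWED", url)
```

If a page that is being indexed shows up as ALLOWED here too, the
robots.txt rules are the likelier culprit rather than the spider.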
Thanks again
jrobinson
--- Bill Moseley <moseley@hank.org> wrote:
> On Mon, Oct 31, 2005 at 06:34:59AM -0800, J Robinson wrote:
> > Any tips on how I can debug this? Is there a debug
> > flag for spider.pl that shows robots.txt being parsed
> > and/or urls being matched against it, or anything like
> > that?
>
> Set the debug to "skipped" and it will tell you when
> a file is skipped due to robots.txt.
>
> Then just run the spider on one file they say it's
> skipping.
>
> When I've debugged this in the past I found that the
> robots.txt file was not set up correctly.
>
> --
> Bill Moseley
> moseley@hank.org
Received on Mon Oct 31 06:51:03 2005