--- Bill Moseley <firstname.lastname@example.org> wrote:
> On Mon, Oct 31, 2005 at 06:49:46AM -0800, J Robinson wrote:
> > The actual complaint is that the spider is visiting pages it
> > shouldn't.
> Right -- I had this complaint once, and it turned out to be a syntax
> error in the robots.txt file.
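> A made-up example of the kind of slip that does it: several paths on
> one Disallow line. robots.txt wants one path per line, so the combined
> string matches nothing and nothing is actually blocked:
>
>     # Broken -- parsed as the single path "/private, /tmp":
>     User-agent: *
>     Disallow: /private, /tmp
>
>     # Correct -- one path per Disallow line:
>     User-agent: *
>     Disallow: /private
>     Disallow: /tmp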
> > I'll check out the 'skipped' debug flag -- is there another flag
> > that actually shows URLs being compared against the robots.txt
> > contents?
> The spider just uses LWP::RobotUA, which uses WWW::RobotRules. Those
> modules are widely used, so they should work as expected.
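> If you want to see exactly what WWW::RobotRules makes of a given
> robots.txt, a standalone test along these lines works (an untested
> sketch -- the agent name and URLs below are just placeholders):
>
>     use strict;
>     use warnings;
>     use WWW::RobotRules;
>     use LWP::Simple qw(get);
>
>     # Same module the spider uses; placeholder agent name.
>     my $rules = WWW::RobotRules->new('swish-e-spider/1.0');
>
>     # Fetch and parse the site's robots.txt.
>     my $robots_url = 'http://example.com/robots.txt';
>     my $robots_txt = get($robots_url)
>         or die "Can't fetch $robots_url\n";
>     $rules->parse($robots_url, $robots_txt);
>
>     # Check the URLs you expected the spider to skip.
>     for my $url ('http://example.com/', 'http://example.com/private/a.html') {
>         print "$url => ", $rules->allowed($url) ? "allowed" : "DISALLOWED", "\n";
>     }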
> Try setting, in the spider script:
>
>     use LWP::Debug '+debug';   # or '+' for all trace/debug/conns output
>
> although you might get more info than you want if spidering a lot of
> files. I typically just hack away at the module and throw in prints to
> see what's happening.
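> One way to do that without editing the installed module is to wrap
> allowed() from the spider script itself (again an untested sketch,
> and a throwaway hack at that):
>
>     use WWW::RobotRules;
>
>     # Log every robots.txt check made through WWW::RobotRules.
>     {
>         no warnings 'redefine';
>         my $orig = \&WWW::RobotRules::allowed;
>         *WWW::RobotRules::allowed = sub {
>             my ($self, $url) = @_;
>             my $ok = $orig->($self, $url);
>             warn "robots check: $url => ",
>                 ($ok ? "allowed" : "DISALLOWED"), "\n";
>             return $ok;
>         };
>     }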
> Bill Moseley