I realize that the behavior spider.pl has now is as
described in the docs, and that I brought this up on
the list about 9 months ago. Apologies for repeating myself.
Though I still think the feature should work as I
describe below, I realize this decision's already been
made and documented, and I defer to the decisionaires.
Thanks also for the workaround tips, Bill.
--- Bill Moseley <email@example.com> wrote:
> On Tue, Sep 27, 2005 at 09:52:39AM -0700, J Robinson wrote:
> > Because I don't want to slam the server, even if I'm
> > using keepalives to minimize the impact. Pulling
> > pages off the server could still cause significant
> > load (i.e., dynamic pages). If I wanted to hit pages as
> > fast as possible, I'd set delay_sec to 0! :)
> Then you can just sleep in any of the call-back functions.
> Normally, the point of the keep-alive connection is
> to make the best
> use of the web server's limited resources. If one
> client is holding a
> keep alive connection open then it should be busy
> using that
> connection. Otherwise, free it up for another
> client to use.
> Might want to look at other issues if it only takes
> one busy
> connection to kill the server's performance. But,
> again, it's easy to
> put a delay in, say, test_url if you want to add
> additional delay.
> Just don't delay so much that the web server kills
> the keep-alive
> connection. Then you are just making the problem
> worse.
> Bill Moseley
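
For reference, the workaround Bill describes (sleeping inside the test_url call-back) might look roughly like the config sketch below. This is only a sketch under assumptions: the base_url, the email address, and the 2-second pause are placeholders, and the @servers layout and the ($uri, $server) call-back arguments follow the spider.pl docs as I understand them, so check them against your installed version.

```perl
# Hypothetical spider.pl config file (e.g. SwishSpiderConfig.pl).
# base_url, email, and the pause length are placeholders, not real values.
@servers = (
    {
        base_url   => 'http://www.example.com/',
        email      => 'admin@example.com',
        keep_alive => 1,    # reuse one connection for successive requests

        # test_url is called for each URL before it is fetched.
        test_url => sub {
            my ( $uri, $server ) = @_;
            sleep 2;     # extra pause; keep it shorter than the
                         # web server's keep-alive timeout
            return 1;    # true means "go ahead and fetch this URL"
        },
    },
);

1;
```

As Bill notes, the pause should stay short enough that the web server does not drop the keep-alive connection in the meantime.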
Received on Tue Sep 27 10:18:25 2005