Re: spidering, delay_sec and keep_alive

From: J Robinson <jrobinson852(at)>
Date: Tue Sep 27 2005 - 17:18:23 GMT
I realize that the behavior is now as
described in the docs, and that I brought this up on
the list about 9 months ago. Apologies for repeating
myself.

Though I still think the feature should work as I
describe below, I realize this decision's already been
made and documented, and I defer to the decisionaires.

Thanks also for the workaround tips, Bill.


--- Bill Moseley <> wrote:

> On Tue, Sep 27, 2005 at 09:52:39AM -0700, J Robinson wrote:
> > Because I don't want to slam the server, even if I am
> > using keepalives to minimize the impact. Retrieving
> > pages off the server could still cause significant
> > load (i.e., dynamic pages). If I wanted to hit pages as
> > fast as possible, I'd set delay_sec to 0! :)
>
> Then you can just sleep in any of the call-back functions.
>
> Normally, the point of the keep-alive connection is to make the best
> use of the web server's limited resources.  If one client is holding a
> keep-alive connection open then it should be busy using that
> connection.  Otherwise, free it up for another client to use.
>
> Might want to look at other issues if it only takes one busy
> connection to kill the server's performance.  But, again, it's easy to
> put a delay in, say, test_url if you want to add additional delay.
>
> Just don't delay so much that the web server kills the keep-alive
> connection.  Then you are just making the problem worse.
> -- 
> Bill Moseley
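
For anyone finding this thread later, here is a minimal sketch of the workaround Bill describes: a sleep placed in the test_url call-back of a spider config file. The base_url, email address, and 2-second pause are placeholders of mine, not values from this thread, and I have not checked exactly when spider.pl invokes test_url, so confirm that in the spider docs before relying on the pacing.

```perl
# SwishSpiderConfig.pl: illustrative sketch only, not a tested config.
# The site, contact address, and pause length are hypothetical values.
@servers = (
    {
        base_url   => 'http://www.example.com/',   # hypothetical site to spider
        email      => 'admin@example.com',         # hypothetical contact address
        keep_alive => 1,                           # ask for a keep-alive connection
        delay_sec  => 0,                           # rely on the call-back below for pacing

        # Call-backs receive the URI object and the server hash.
        test_url => sub {
            my ( $uri, $server ) = @_;
            sleep 2;     # extra pause each time this call-back runs
            return 1;    # true means the spider should keep this URL
        },
    },
);

1;
```

Per Bill's caution, keep the pause well below the web server's keep-alive timeout (KeepAliveTimeout on Apache); otherwise the server drops the connection and each request pays for a new one.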

Received on Tue Sep 27 10:18:25 2005