Re: spider.pl's delay_sec & keep_alive

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jan 20 2005 - 14:13:13 GMT
On Thu, Jan 20, 2005 at 05:21:04AM -0800, J Robinson wrote:
 
> Seems to me that the delay_sec should be respected
> even if the connection is keep-alive. Just because a
> connection to a server is kept alive doesn't mean that
> fetching pages doesn't cause a load on the server!

Enabling keep alives doesn't mean every connection ends up being a
keep alive connection.  When the connection is not a keep alive
connection the spider waits.  Otherwise, if the server is allowing
the keep alive connection then it's saying "hit me again!"

In general, I think you want your server processes working.  You don't
want to go to all that trouble of tuning your web server just to have
some silly spider holding your precious servers hostage.  Think about
mod_perl tuning using a proxy front end.  The idea is to free up your
server from slow clients to allow it to keep serving.

The best thing for the spider to do to be nice is use keep alives.
That allows the spider to get its content with the least overhead.

The server admin can then tune the number of connections to allow for
available memory and usage averages.

> Perhaps a separate keep_alive_delay_sec could be used?

Yes, because even with keep_alive enabled not all connections are keep
alive.  So you want a separate delay.

And you can have that easily.  Just add a sleep statement in the
test_response(), filter_content(), or (the new!) output_function()
callback.  If you are also using delay_sec, you should be able to
check the response header to see whether the connection is keep alive
before deciding whether to sleep.
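
As a rough sketch of that idea (not tested against any particular
spider.pl release), a config file could look something like the one
below.  The callback name, the helper is_keep_alive(), the
$keep_alive_pause variable, and the base_url/email values are all made
up for illustration.  It also assumes the server announces a
persistent connection with a "Connection: Keep-Alive" header; an
HTTP/1.1 server may keep the connection open without sending one, so
treat the header check as a heuristic.

```perl
# SwishSpiderConfig.pl -- a sketch, not a tested configuration.

# Hypothetical: seconds to pause after a response on a keep-alive connection.
my $keep_alive_pause = 1;

# True if the response headers announce a persistent connection.
# $response is expected to behave like an HTTP::Response object.
sub is_keep_alive {
    my ($response) = @_;
    my $connection = $response->header('Connection') || '';
    return $connection =~ /\bkeep-alive\b/i ? 1 : 0;
}

# test_response callback: assumed to receive the URI, the server hash,
# and the response object, in that order.
sub pause_on_keep_alive {
    my ( $uri, $server, $response ) = @_;
    sleep $keep_alive_pause if is_keep_alive($response);
    return 1;    # true tells the spider to keep processing this document
}

@servers = (
    {
        base_url      => 'http://localhost/',    # placeholder
        email         => 'admin@localhost',      # placeholder
        keep_alive    => 1,
        delay_sec     => 5,    # still used on non-keep-alive connections
        test_response => \&pause_on_keep_alive,
    },
);

1;
```

Doing the pause in the callback keeps delay_sec for ordinary
connections and lets you pick a different (usually shorter) pause for
keep alive ones without touching the spider itself.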

So, since a callback already covers it, I would not want to add a
special delay setting for keep alive connections.

-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Thu Jan 20 06:13:13 2005