
Re: Time or file limit on searching

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Oct 01 2002 - 20:59:25 GMT
At 11:32 AM 10/01/02 -0700, Craig Baetz wrote:
>Can anyone offer any advice on limiting the number of files searched or the
>time to search?  I have an application with a great many sites to search,
>and want to limit sites that take too long to search.

I'm not sure I understand what you are asking.  How to limit the time a
search is running?

I normally use a SIGALRM and kill off the process if I get the signal.  You
can't do that under Win32, but I'd think there must be something similar.
Anyone?
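
For what it's worth, here's a rough sketch of the SIGALRM approach (Unix
only; the command, config name, and 300-second limit are just placeholders):

  #!/usr/bin/perl
  use strict;
  use warnings;

  my $timeout = 300;                                          # example limit in seconds
  my @cmd = ( 'swish-e', '-c', 'swish.conf', '-S', 'prog' );  # placeholder command

  my $pid = fork();
  die "fork failed: $!" unless defined $pid;

  if ( $pid == 0 ) {
      exec @cmd or die "exec failed: $!";
  }

  eval {
      local $SIG{ALRM} = sub { die "timeout\n" };
      alarm $timeout;
      waitpid $pid, 0;       # wait for the indexer to finish normally
      alarm 0;
  };
  if ( $@ && $@ eq "timeout\n" ) {
      kill 'KILL', $pid;     # kill off the process when the alarm fires
      waitpid $pid, 0;       # reap the killed child
      warn "indexing killed after $timeout seconds\n";
  }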

Or are you talking about spidering and how long to wait for a response from
the remote server after requesting a URL?  The Perl LWP code has a timeout,
although I'm not sure whether it works on Win32.  I believe there's a
timeout setting for spider.pl.
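
The timeout I'm thinking of is the one on LWP::UserAgent, something like
this (the URL and the 30 seconds are just examples):

  use strict;
  use warnings;
  use LWP::UserAgent;

  # Give up on any single request after 30 seconds (example value).
  my $ua = LWP::UserAgent->new( timeout => 30 );

  my $response = $ua->get('http://www.example.com/page.html');

  if ( $response->is_success ) {
      print $response->content;
  }
  else {
      warn "Request failed or timed out: ", $response->status_line, "\n";
  }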

>I am planning to search each site separately, and merge the results into a
>'master' index after the search.  Have others found this to be a good way to
>build indexes?

Seems reasonable.
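
If I remember the syntax right, merging the per-site indexes into a master
index would be something like this with swish-e's -M switch (index file
names are made up):

  use strict;
  use warnings;

  # Merge the per-site indexes into one master index.
  # The index file names here are placeholders.
  my @site_indexes = glob 'indexes/site_*.index';

  system( 'swish-e', '-M', @site_indexes, 'indexes/master.index' ) == 0
      or die "swish-e merge failed: $?";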

If I were doing that often and the spidering was slow, I'd probably cache
the files locally and index those.  You might be able to find some software
to mirror the sites you wish to index locally.
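
Something along these lines, using wget as an example mirroring tool (the
site, paths, and config name are placeholders):

  use strict;
  use warnings;

  my $site  = 'http://www.example.com/';
  my $cache = '/var/tmp/mirror';          # local cache directory (placeholder)

  # Mirror the site to local disk first...
  system( 'wget', '--mirror', '--no-parent', '-P', $cache, $site ) == 0
      or die "wget failed: $?";

  # ...then index the cached files with the file-system method
  # instead of spidering over HTTP.
  system( 'swish-e', '-c', 'swish.conf', '-S', 'fs', '-i', $cache ) == 0
      or die "swish-e indexing failed: $?";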

I don't think I answered your question, though.

>Background:  running swish-e.exe on Win32, in -S mode.

-- 
Bill Moseley
mailto:moseley@hank.org