Re: question about robots.txt

From: Bill Conlon <bill(at)not-real.tothept.com>
Date: Wed Jul 28 2004 - 16:55:49 GMT
I modified the callback function:

	   test_url => sub {
	       my $uri = shift;
	       # Skip images and gzip archives.
	       return 0 if $uri->path =~ /\.(gif|jpeg|png|gz)$/;
	       # Skip the archive index (a regex here; qw() builds a word list).
	       return 0 if $uri->path =~ m{/archive/index\.html};
	       return 1;
	   },
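
To check which paths the callback accepts without running a full crawl, the same logic can be exercised in a small standalone script. This is only a sketch: StubURI is a hypothetical stand-in that provides nothing but the path() method the callback calls (the spider passes real URI objects), and the sample paths are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a URI object: it only implements path(),
# which is the single method the callback relies on.
package StubURI;
sub new  { my ($class, $path) = @_; return bless { path => $path }, $class }
sub path { return $_[0]{path} }

package main;

# Same logic as the callback, with the archive test written as a regex.
my $test_url = sub {
    my $uri = shift;
    return 0 if $uri->path =~ /\.(gif|jpeg|png|gz)$/;     # images, gzip files
    return 0 if $uri->path =~ m{/archive/index\.html};    # archive index page
    return 1;
};

# Illustrative paths only; none of these come from the real site.
for my $path ('/index.html', '/logo.png', '/archive/index.html',
              '/archive/2004/msg.html') {
    my $verdict = $test_url->(StubURI->new($path)) ? 'spider' : 'skip';
    print "$path => $verdict\n";
}
```

Note that the second test is not anchored, so it rejects any path containing /archive/index.html, while other pages under /archive/ are still spidered.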


>--- Bill Conlon <bill@tothept.com> wrote:
>> I have a robots.txt file:
>> 
>
>Is the robots.txt file located at /robots.txt ?
Yes.
>Maybe the spider can't get to it because of authentication (I can't get
>to http://beowulf3.tothept.com/robots.txt )
No, the spider configuration includes the username and password in base_url.
>
>
>=====
>Greg Fenton
>greg_fenton@yahoo.com
>


Bill Conlon

To the Point
345 California Avenue Suite 2
Palo Alto, CA 94306

office: 650.327.2175
fax:    650.329.8335
mobile: 650.906.9929
e-mail: mailto:bill@tothept.com
web:    http://www.tothept.com
Received on Wed Jul 28 09:55:58 2004