On Wed, Jan 05, 2005 at 12:15:48PM -0800, Lance Perry wrote:
> I am spidering a site (spidering is being called from the swish indexing).
> The site contains .exe and .zip files. I DO NOT want those files to be
> indexed (or even downloaded).
Do it the same way as the example in the spider.pl docs that skips .gif,
.jpeg and .png files, but match \.exe and \.zip instead. Alternatively, use
robots.txt to list the files.
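
As a rough sketch of the first approach, assuming the usual spider.pl config
layout with a test_url callback (the base_url and email values below are
placeholders, and the /i flag is my own addition so that .EXE and .ZIP are
caught too):

```perl
# SwishSpiderConfig.pl -- minimal sketch, not a complete config
@servers = (
    {
        base_url => 'http://www.example.com/',   # placeholder start URL
        email    => 'admin@example.com',         # placeholder contact address

        # test_url is called with a URI object before the request is made.
        # Returning false skips the URL, so the file is never fetched.
        test_url => sub {
            my $uri = shift;
            return $uri->path !~ /\.(?:exe|zip)$/i;
        },
    },
);

1;
```

Because the check runs on the URL before any request, this should cover both
"not indexed" and "not downloaded".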
> User-agent: *
> Disallow: /downloads/cisco-vpn/*.exe$
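That's not valid robots.txt syntax. You can't use regex patterns; under the
original robots exclusion standard each Disallow value is matched as a plain
path prefix.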
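
For illustration only, a rule that sticks to plain path prefixes could look
like the following. It assumes nothing else under /downloads/cisco-vpn/ needs
to be indexed; if that is not the case, each file would have to be listed by
its full path on its own Disallow line instead.

```
User-agent: *
Disallow: /downloads/cisco-vpn/
```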
Received on Wed Jan 5 13:03:47 2005