
Re: Spider, but not index?

From: David Wood <dwood(at)>
Date: Wed Jun 23 2004 - 14:04:39 GMT
In your spider config file, put something like this:

@servers = (
    {
        # ... your other server settings (base_url, email, etc.) go here ...
        test_response => \&test_response,
    },
);

sub test_response {
    my $uri    = $_[0];
    my $server = $_[1];
    my $url    = "";

    # These URLs should be spidered, but not indexed, as they're too generic.
    my @SNUBBED_URLS = (
        # ... path endings to spider but not index ...
    );

    foreach $url (@SNUBBED_URLS) {
        # no_index tells spider.pl to follow the page's links without indexing it.
        $server->{no_index} = 1 if ( $uri->path =~ /$url$/ );
    }

    return 1;    # keep processing the page
}
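The callback sits in the spider config file that spider.pl loads, so (assuming
the usual prog-method setup; file names below are only examples) the indexing
run would look roughly like this:

     # swish.conf (example names)
     IndexDir            ./spider.pl
     SwishProgParameters spider_config.pl

     # run the index:
     swish-e -c swish.conf -S prog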




At 15:40 Wednesday 23-6-2004, David VanHook wrote:

>Is there a relatively easy way to get SWISH-E to spider a page (i.e., to
>follow all of the links on it), but to not index the contents of that same
>page?  I've tried using FileRules title in the config file, but am having no
>luck -- I get a Bad Directive error, even when I paste in the code directly
>from the online docs.
>Dave VanHook
