Re: Spidering on Windows

From: Ron Klatchko <ron(at)>
Date: Fri Oct 30 1998 - 17:14:12 GMT
David Norris wrote:
> It works fine.  However, I am having a weird little problem.  I don't think
> it is related to my changes.  The problem is 'URL disallowed by server' when
> it encounters a link to another location on my server.  It seems that the
> only URL allowed is http://myserver , even http://myserver/ is not allowed,
> neither is http://myserver/index.html

Swish behaves like a good spider and obeys robots.txt.  Looking at your
server's robots.txt:

> telnet 80
Connected to
Escape character is '^]'.
GET /robots.txt HTTP/1.0

HTTP/1.1 200 OK
[headers cut...]
Content-Type: text/plain

User-agent: *
Disallow: /
Allow /tomahawk

Hmm, I wonder what it could be...
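
To spell it out: "Disallow: /" under "User-agent: *" shuts every spider out of every path on the server. Here is a rough sketch of how you could check that yourself with Python's standard urllib.robotparser module. It is not how Swish evaluates the file, just an independent parser fed the same three lines. The agent name "swish-spider" and the helper allowed() are made up for the illustration; "myserver" is the placeholder host from your message.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body exactly as the server returned it above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
Allow /tomahawk
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="swish-spider"):
    """Return True if `agent` may fetch `path` on http://myserver.

    Both the agent name and this helper are hypothetical.
    """
    return parser.can_fetch(agent, "http://myserver" + path)


for path in ("/", "/index.html", "/tomahawk"):
    print(path, "allowed" if allowed(path) else "disallowed")
```

With this parser all three paths come back disallowed. Note that the "Allow /tomahawk" line has no colon, so this particular parser skips it as malformed; other spiders may treat it differently, but the "Disallow: /" line is enough to explain the errors you are seeing.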

          Ron Klatchko - Manager, Advanced Technology Group           
           UCSF Library and Center for Knowledge Management           
Received on Fri Oct 30 09:24:28 1998