Re: RE: LWP,HTTP and HTML modules

From: Yann Stettler <stettler(at)>
Date: Wed Jan 20 1999 - 14:33:03 GMT
Mark Gaulin wrote:

> Allowing the person who configures swish to specify *some* file
> extensions that he does not want indexed/spidered is just a feature.

Thanks! At least someone understands what an "option" is...
and why we call it that...  :)

> We all know that some of the big search engines (altavista, infoseek,
> etc, etc) do not try to index pages that have a certain "look" to them...
> some skip "cgi" or ".exe", others anything with a "?" or "&" in them.
> I think I could say fairly certainly that AltaVista is not trying to download
> any of the gif files from my site.  It does not know *for sure* if those

That's right. Actually, proxy caches have a stop-list that
usually ignores any URL containing "/cgi-bin/" or a "?" (query
string). It's also a _configurable option_.
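
To illustrate the kind of stop-list I mean, here is a minimal sketch in Python. The function name `is_stoplisted` and the default patterns are assumptions made up for this example; they are not taken from any particular proxy or from swish.

```python
# Hypothetical stop-list: substrings that mark a URL as "do not cache/index".
# The administrator can replace these, which is what makes it an option.
DEFAULT_STOP_PATTERNS = ("/cgi-bin/", "?")


def is_stoplisted(url, patterns=DEFAULT_STOP_PATTERNS):
    """Return True if the URL contains any of the stop-list substrings."""
    return any(pattern in url for pattern in patterns)
```

Passing an empty tuple as `patterns` disables the check entirely, so nothing is forced on anyone.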

> Having said all of that, I am not saying that someone specific "must go
> implement this right now, or else!"...  I'm just saying that this feature is
> not wrong or a sign of stupidity, and in some cases, is highly desirable.

Actually it's already implemented: it's in the filesystem method. It
was just not included in the HTTP method source file. A simple
cut-and-paste is nearly enough to add it back. I did it and posted
the new file some time ago...
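
For anyone who wants the idea without digging through the source, a rough sketch of an extension check for the HTTP side could look like the following. This is Python for illustration only, not the actual swish code; `SKIP_EXTENSIONS` and `should_spider` are hypothetical names, and the extension list is just an example.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical option: extensions the administrator does not want spidered.
SKIP_EXTENSIONS = {".gif", ".jpg", ".exe"}


def should_spider(url, skip_extensions=SKIP_EXTENSIONS):
    """Return False when the URL path ends in an excluded extension."""
    path = urlsplit(url).path  # drops the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension not in skip_extensions
```

The check only looks at the path part of the URL, so a query string after the file name does not hide the extension.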

Yann Stettler

TheNet - Internet Services AG              CohProg SaRL                           
Anime and Manga Services         
