On 1/10/07, Bill Moseley <moseley@hank.org> wrote:
> On Wed, Jan 10, 2007 at 07:14:01AM -0800, James wrote:
> > So, you think that it is a case of them just putting that information
> > into their user / server agent string to trick the server into
> > thinking they are viewing through a browser? That's an interesting
> > thought. So, maybe they aren't running any spiders through browsers.
> > Or is it still possible that they do run the spider through a browser?
>
> The user agent string is just sent with a request. The server can
> decide what to do with it -- the vast majority ignore it. That's
> what I'd recommend doing, too.
Understood.
>
> Do you have an actual problem with this? I mean, do you know for a
> *fact* (not because you guess so) that some server is not sending you
> a document based on the user agent string? Have you discussed this
> with that site's admin?
No problem, except that I suspect the Gecko-based spider is probably
more powerful and filters out more garbage (such as the hidden text
issues).
>
> Even if you do have a real problem, then just set the agent string. I
> think we covered how to do that now.
I've already done it and tested it - my new user agent / server agent
strings are working well. So, that's not the issue.
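
For the archive, here is a minimal sketch of what setting the agent
string amounts to at the HTTP level. It uses Python's standard urllib
rather than the spider discussed in this thread, and the agent string
and URL are made-up placeholders, so treat it as an illustration only:

```python
from urllib.request import Request

# Hypothetical values for illustration; neither appears in this thread.
SPIDER_AGENT = "ExampleSpider/0.1 (+http://example.com/spider-info)"
TARGET_URL = "http://example.com/"


def build_request(url, agent):
    """Return a Request that carries the given User-Agent header.

    Nothing is sent here; the header is only attached to the request.
    """
    return Request(url, headers={"User-Agent": agent})


req = build_request(TARGET_URL, SPIDER_AGENT)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

The header is just a line of text that travels with the request, so
claiming to be a browser takes no more than changing that one string.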
>
>
> --
> Bill Moseley
Thanks for your time,
James
Received on Wed Jan 10 07:45:53 2007