On Wed, Jan 10, 2007 at 06:48:24AM -0800, James wrote:
> Thanks, I'll check out the link. By the way, I did take the time to
> read up on UTF-8 and user agents. But since there is a plethora of
> information and since you guys are the experts, I am asking you
> because I figure that you will be able to speed up my learning and/or
> point me to some information that you are already aware of. That's
> why novices seek out help from experts in forums and discussion
> groups, right? Believe me, I have spent hours and hours already,
> before even posting, trying to find useful information, even on
> Mozilla's own site.
Yes, it's a real time killer trying to learn this stuff. Seems like
that's a big chunk of my day. Ignore my early morning sarcasm -- the
list archives are full of it.
> > and in regard to the user agent question: I believe that one reason bots
> > identify themselves as particular user agents is because they want to receive
> > the same responses that the server would hand out to those non-bot agents.
> So, is this a real benefit to the Swish-e Spider and how would it be
Most people are indexing sites they run, so they know whether their content
looks at the user agent. You might have an intranet with content that you
don't control that checks the agent string, but you could see that
when indexing (by having the spider tell you files rejected due to
Google spiders everyone so it fakes the UA for those misguided
How would it be accomplished? You mean how to set the agent string?
agent => "Mozilla/5.0 (compatible; Googlebot/2.1; http://www.google.com/bot.html)",
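For context, here is a rough sketch of where that line would sit in a
spider config file. It assumes the usual spider.pl layout, where the
config file fills a @servers list with one hash per site. The base_url
and email values are placeholders, not real settings, so check the
spider.pl documentation for your version before relying on it.

```perl
# Hypothetical SwishSpiderConfig.pl. base_url and email are placeholders.
@servers = (
    {
        # Placeholder: where the spider starts crawling.
        base_url => 'http://intranet.example.com/',

        # Placeholder: contact address for the site's admin to reach you.
        email    => 'admin@example.com',

        # The agent string sent with each request, same as the line above.
        agent    => 'Mozilla/5.0 (compatible; Googlebot/2.1; http://www.google.com/bot.html)',
    },
);

# End the file with a true value, as Perl config files conventionally do.
1;
```

Leave the agent setting out and the spider identifies itself with its
own default string instead.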
Received on Wed Jan 10 07:02:31 2007