Hi Adam,
I'm not sure why it's any more dangerous to require/allow the swish-e
spider to log in to an application than any other user agent that
presents credentials. In fact, for a public-facing application, far
more checks (username/password, IP address, a one-of-a-kind user
agent) can be applied to the spider than is feasible with a normal
user's login.
Merely enabling cookies by itself presents just as much risk of forgery.
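
To make the "more checks" point concrete, here is a minimal Perl sketch of how a login handler might combine the three checks for the spider account. Everything in it is an assumption for illustration: the %SPIDER values, the IP address, the user-agent string, and the spider_login_ok name are hypothetical and not taken from my application. It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# Hypothetical values for illustration only.
my %SPIDER = (
    userid     => 'swishe',
    password   => 'swishe',
    allowed_ip => { '192.0.2.10' => 1 },            # documentation-range address
    user_agent => 'my-private-swish-spider/1.0',    # one-of-a-kind agent string
);

# Returns 1 only when credentials, source IP, and user agent all match.
sub spider_login_ok {
    my (%req) = @_;
    return 0 unless ($req{userid}     // '') eq $SPIDER{userid};
    return 0 unless ($req{password}   // '') eq $SPIDER{password};
    return 0 unless $SPIDER{allowed_ip}{ $req{ip} // '' };
    return 0 unless ($req{user_agent} // '') eq $SPIDER{user_agent};
    return 1;
}

my %request = (
    userid     => 'swishe',
    password   => 'swishe',
    ip         => '192.0.2.10',
    user_agent => 'my-private-swish-spider/1.0',
);
print spider_login_ok(%request) ? "spider accepted\n" : "spider rejected\n";
```

A request with the right password but the wrong source address or agent string is rejected, which is a stricter test than a normal user's login usually gets.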
Anyway, here's a snip from my @servers:
@servers = (
    {
        # Log in via GET as the dedicated 'swishe' user
        base_url    => 'http://my.domain.com/login.app?_function=checkpw&userid=swishe&password=swishe&remember=no',
        use_cookies => 1,
        # debug     => DEBUG_URL | DEBUG_SKIPPED | DEBUG_FAILED | DEBUG_HEADERS,
        delay_sec   => 1,
        # Skip the logout link so the session stays valid
        test_url    => sub {
            my $ok = !($_[0]->path =~ /login\.app/ && $_[0]->query =~ /_function=logout/);
            return 1 if $ok;
            return;
        },
...
Essentially, the spider logs in as the user 'swishe', so it sees the
same content as any similarly privileged user.
remember=no tells the application not to give swish-e a long-term
cookie to re-authenticate with.
use_cookies lets the application provide, and swish-e use, the
session cookies needed for access.
test_url keeps the spider from following a link that logs it out, so
the session stays valid and we can follow all the other links.
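
If you want to check the test_url logic outside the spider, a small sketch like the one below should do. The FakeURI package is a hypothetical stand-in that provides only the path and query methods the callback uses; the real spider passes a URI object. The two sample links are made up, and I've added a guard for links that have no query string (needs Perl 5.10 or later for //).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the URI object the spider passes to test_url.
package FakeURI;
sub new   { my ($class, %parts) = @_; return bless {%parts}, $class }
sub path  { $_[0]{path} }
sub query { $_[0]{query} }

package main;

# Same rule as in the config: refuse only the logout link.
my $test_url = sub {
    my ($uri) = @_;
    my $is_logout = $uri->path =~ /login\.app/
        && ($uri->query // '') =~ /_function=logout/;
    return $is_logout ? 0 : 1;
};

my $logout = FakeURI->new(path => '/login.app', query => '_function=logout');
my $page   = FakeURI->new(path => '/reports/index.html');

print 'logout link: ', ($test_url->($logout) ? 'follow' : 'skip'), "\n";
print 'normal page: ', ($test_url->($page)   ? 'follow' : 'skip'), "\n";
```

The login URL itself (with _function=checkpw) still passes, since only the combination of login.app and _function=logout is refused.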
Bill
On Feb 5, 2008, at 1:05 PM, Adam Douglas wrote:
> Hi William. Well, that would be a workable solution, however not one
> that should be used in my opinion. It's too dangerous and should not
> be necessary. Thanks for the reply and suggestion.
>
> Best,
> Adam
>
>> Date: Wed, 23 Jan 2008 13:14:20 -0800
>> From: William Conlon <bill@tothept.com>
>> Subject: Re: [swish-e] How do I index via HTTP when authentication is required?
>> To: Swish-e Users Discussion List <users@lists.swish-e.org>
>>
>> I wrote a backdoor in my login application that allows specified IP
>> addresses to login via GET, in order to have a simple way for swish-e
>> to access protected content.
>>
>> Then just create a username/password combination for swish-e to login
>> with.
>
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
Received on Tue Feb 5 17:14:14 2008