Re: [swish-e] Trac & wiki authorization

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Sep 05 2008 - 15:14:16 GMT
On Fri, Sep 05, 2008 at 03:59:41PM +0800, Tian Xinchun wrote:
> Dear experts,
> 
> I am trying to index the password protected areas like Trac & mediawiki, I 
> have succeeded in the public and other basic authorization web pages using 
> "credentials => 'username:password'". It seems that this solution does not 
> work for Trac & mediawiki. Any help?

The credentials option is for HTTP basic auth only.  You would need to
alter your script to log in by posting to the login form.

I'm not sure of the details, but you likely need to alter the spider so
that, before it starts spidering, it makes a request (a POST) to the
login form using the same user agent (and thus the same cookie jar)
that the spider uses.  The login response will then save the session
cookie in that jar, and you should be able to spider the protected pages.
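
A minimal sketch of that idea using plain LWP::UserAgent, outside of
spider.pl.  The form field names (username, password), the agent string,
and the command-line arguments are assumptions for illustration only; the
real field names have to be read from the login page's HTML, and some
login forms also require a hidden token.  How to hand this user agent to
the spider is not shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent with an in-memory cookie jar, so that any session
# cookie set by the login response is sent on later requests.
sub new_agent {
    return LWP::UserAgent->new(
        agent      => 'swish-e login sketch',   # hypothetical agent string
        cookie_jar => {},   # hash ref: LWP creates an HTTP::Cookies jar
    );
}

# POST the login form.  The field names are placeholders (assumptions).
# This only checks the HTTP status; a rejected login often still returns
# 200, so verify by fetching a protected page afterwards.
sub login {
    my ($ua, $login_url, $user, $pass) = @_;
    my $res = $ua->post($login_url, {
        username => $user,
        password => $pass,
    });
    # Many login forms answer a successful POST with a redirect.
    return $res->is_success || $res->is_redirect;
}

# Runs only when called as:
#   perl login_sketch.pl LOGIN_URL PAGE_URL USER PASS
if (@ARGV == 4) {
    my ($login_url, $page_url, $user, $pass) = @ARGV;
    my $ua = new_agent();
    login($ua, $login_url, $user, $pass)
        or die "Login request failed\n";
    # Same agent, same cookie jar: the session cookie goes along.
    my $page = $ua->get($page_url);
    print $page->status_line, "\n";
}
```

If the status line for a protected page comes back as 200 and the body is
not the login form again, the cookie is being sent correctly.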

I'd also look at just indexing the data directly from the database
instead of spidering, if you can figure out the URL mapping.
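
As a rough sketch of the database route: swish-e's "prog" input method
reads documents from a program's standard output, each one preceded by
Path-Name and Content-Length headers.  The @rows list below is a
hypothetical stand-in for whatever a real database query would return,
and the example.org URLs are placeholders for the actual URL mapping.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Stand-in for rows fetched from the wiki or Trac database (made-up data).
my @rows = (
    { url => 'http://example.org/wiki/Main_Page', text => 'Welcome page text' },
    { url => 'http://example.org/wiki/Howto',     text => 'Some howto text'   },
);

# Format one document the way swish-e -S prog expects it:
# headers, a blank line, then exactly Content-Length bytes of content.
sub swish_document {
    my ($url, $content) = @_;
    my $bytes = encode_utf8($content);   # the length must be counted in bytes
    return "Path-Name: $url\n"
         . "Content-Length: " . length($bytes) . "\n"
         . "\n"
         . $bytes;
}

print swish_document($_->{url}, $_->{text}) for @rows;
```

A script like this would be run by swish-e itself, along the lines of
"swish-e -S prog -i ./dump.pl", where dump.pl is a hypothetical file name.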


> 
> Following is part of my spider.conf which has problem.
> my %privatewiki = (
>     email       => 'tianxc@ihep.ac.cn',
>     base_url    => 'https://wiki.bnl.gov/dayabay-private/index.php?title=Main_Page',
>     delay_sec   => '1',
>     max_depth   => '2',
>     credentials => 'username:password'
> );
> 
> my %repository  = (
>     email       => 'tianxc@ihep.ac.cn',
>     base_url    => 'http://dayabay.ihep.ac.cn/tracs/dybsvn/browser/',
>     delay_sec   => '1',
>     max_depth   => '5',
>     credentials => 'username:password'
> );
> 
> Thanks
> Xinchun Tian
> 
> 
> 
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
> 

-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs

Received on Fri Sep 5 11:14:18 2008