Re: Swish and sessions or cookies

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Tue Oct 22 2002 - 17:06:37 GMT
[Moved to the swish-e list]

At 06:15 PM 10/22/02 +0200, Dominik Marti wrote:
>Some online-journals are protected by username and password which will 
>also be saved  in our database. So when indexing some protected urls 
>using a recursion depth more than 1, username and password are always 
>needed. Could you tell me if swish-e can handle sessions or cookies, so 
>it wouldn't be necessary to always ask the database for the username and 
>password. It would be a better solution working with either sessions or 
>cookies. I wasn't able to find this information in the docs.

Swish-e indexes files; it doesn't really know anything about passwords or
cookies.

Obviously, there are different ways to use "passwords" on the web, so your
question can't really be answered in general.  A session is something else,
too, but is often maintained by use of cookies, URLs, or form fields.

Does the spider script that comes with swish-e (spider.pl) support cookies?
Yes.

Does the spider script ask for a password when fetching a doc that is
protected?  Yes.

Could the spider script be easily modified to look up usernames and passwords
in a database based on the URL?  Sure, that would not be difficult.
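
As a rough illustration of that kind of lookup, here is a minimal sketch in
Python. It is not spider.pl code and does not use any spider.pl hook. The
table name, column names, example hosts, and both function names are
hypothetical, and an in-memory SQLite database stands in for whatever database
actually holds the logins. It picks the credentials whose stored URL prefix is
the longest match for the URL being fetched.

```python
import sqlite3


def open_credentials_db():
    """Build a throwaway in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE credentials ("
        "url_prefix TEXT PRIMARY KEY, username TEXT, password TEXT)"
    )
    conn.executemany(
        "INSERT INTO credentials VALUES (?, ?, ?)",
        [
            ("http://journal-a.example/", "alice", "secret-a"),
            ("http://journal-a.example/archive/", "archivist", "secret-b"),
        ],
    )
    return conn


def lookup_credentials(conn, url):
    """Return (username, password) for the longest stored prefix of url,
    or None when no stored prefix matches."""
    return conn.execute(
        "SELECT username, password FROM credentials "
        "WHERE substr(?, 1, length(url_prefix)) = url_prefix "
        "ORDER BY length(url_prefix) DESC LIMIT 1",
        (url,),
    ).fetchone()
```

A spider would call something like lookup_credentials() once per protected
URL and hand the result to its HTTP client instead of prompting for it.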

Could passwords looked up in a database be stored in a cookie?  That all
depends on the web server.


>Some articles from several online journals could be the same. Can 
>swish-e handle this situation so it filters similar articles?

Not swish, but the spider can use an MD5 checksum to filter out duplicate docs.
But the docs would have to be exactly the same for that to work.
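
The idea behind checksum-based filtering can be sketched like this in Python.
This is an illustration only, not how spider.pl implements it; the function
name and the (url, body) input shape are assumptions made for the example.
Each document body is hashed with MD5, and a document is skipped when its
digest has already been seen.

```python
import hashlib


def filter_duplicates(docs):
    """Given (url, body_bytes) pairs, return the URLs of documents whose
    content has not been seen before, in their original order."""
    seen = set()
    unique_urls = []
    for url, body in docs:
        digest = hashlib.md5(body).hexdigest()
        if digest in seen:
            continue  # byte-for-byte identical to an earlier document
        seen.add(digest)
        unique_urls.append(url)
    return unique_urls
```

Note that a single differing byte, such as a trailing space or a different
page footer, produces a different digest, so near-duplicates are not caught.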



-- 
Bill Moseley
mailto:moseley@hank.org
Received on Tue Oct 22 17:21:07 2002