I've had my downtime and am getting back into this again, this time for the workplace. We have several internal documentation sites, and searching each of them individually can be a pain, so I decided to spider all of them and make them searchable via swish.cgi. It is working fairly well so far, but I am having a hard time spidering sites that require authentication. Each site is indexed individually, and this is the basic conf that I am using:
###############################
IndexDir spider.pl
SwishProgParameters default http://restricted-website.com/dir/index.php
IndexFile /path/to/indexes/restricted-website.index
StoreDescription HTML* <body> 200000
##############################
Now, I thought I was supposed to put the username and password into the URL, as follows:
##############################
http://username:password@restricted-website.com/dir/index.php
##############################
But that is not working very well.
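In case it helps with narrowing this down: as far as I understand, a user:password pair in the URL can only help when the server asks for HTTP (Basic or Digest) authentication, not when the site uses its own PHP login form. Here is a minimal sketch for checking which one it is, assuming curl is installed on the indexing host. The function name auth_probe is just something I made up for illustration.

```shell
#!/bin/sh
# auth_probe URL: show the status line and any WWW-Authenticate header
# that the server sends back for an unauthenticated request.
# (auth_probe is a hypothetical helper name, not part of swish-e.)
auth_probe() {
    # -s: no progress output, -o /dev/null: discard the body,
    # -D -: dump the response headers to stdout
    curl -s -o /dev/null -D - "$1" | grep -i -E '^(HTTP/|WWW-Authenticate:)'
}
```

Calling auth_probe http://restricted-website.com/dir/index.php should show a 401 status with a WWW-Authenticate header if HTTP authentication is in play. A 200, or a redirect to a login page, would suggest a form-based login instead.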
I call swish-e as follows to do the indexing:
swish-e -S prog -c restricted-website.conf
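Since every site has its own conf, the same command gets repeated once per site. A minimal sketch of how that could be looped, assuming all the conf files sit in one directory and end in .conf. The index_all function and the DRY_RUN switch are hypothetical names for illustration, not something swish-e provides.

```shell
#!/bin/sh
# index_all DIR: run swish-e once per *.conf file found in DIR.
# With DRY_RUN set to a non-empty value, only print the commands.
# (index_all and DRY_RUN are hypothetical names for this sketch.)
index_all() {
    conf_dir=$1
    for conf in "$conf_dir"/*.conf; do
        # Skip the literal pattern when the glob matched nothing.
        [ -e "$conf" ] || continue
        if [ -n "${DRY_RUN:-}" ]; then
            echo "swish-e -S prog -c $conf"
        else
            # Keep going with the other sites if one of them fails.
            swish-e -S prog -c "$conf" || echo "indexing failed: $conf" >&2
        fi
    done
}
```

Running DRY_RUN=1 index_all /path/to/confs (the directory is a placeholder) prints the commands without executing them, which makes it easy to check the list before a long indexing run.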
Anyone have any thoughts?
Peace, Troy
Received on Tue Jun 15 10:09:07 2010