I'm using spider.pl to index my Joomla! site, and the spider is picking up URLs with the PHP session variable (?PHPSESSID=askjhdskljashdk) on the end. While the site is being indexed, the session ID changes a few times, so multiple copies of the same document get indexed. The links also appear in the database with the session variable in them.
I was able to modify my swish.conf file to remove the PHP session ID variables:
ReplaceRules regex /\?PHPSESSID.*$//i
ReplaceRules regex /&PHPSESSID.*$//i
but multiple entries still appear in the database for each document. What am I doing wrong?
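To show what I expect those two rules to do, here is a small Python sketch of the same two patterns applied to some made-up URLs (example.com and the session IDs are hypothetical, and this is not swish-e's own code). It only illustrates the rewriting of the path; it says nothing about whether the spider or the indexer treats the rewritten URLs as one document.

```python
import re

# Python equivalents of the two ReplaceRules patterns above.
# re.IGNORECASE mirrors the /i flag on the swish.conf regexes.
RULES = [
    re.compile(r"\?PHPSESSID.*$", re.IGNORECASE),
    re.compile(r"&PHPSESSID.*$", re.IGNORECASE),
]

def strip_session_id(url: str) -> str:
    """Apply each rule in order, deleting PHPSESSID and everything after it."""
    for rule in RULES:
        url = rule.sub("", url)
    return url

# Hypothetical URLs: the same page fetched under two different session IDs,
# plus one where PHPSESSID follows another query parameter.
a = strip_session_id("http://example.com/index.php?PHPSESSID=abc123")
b = strip_session_id("http://example.com/index.php?PHPSESSID=def456")
c = strip_session_id("http://example.com/index.php?option=content&PHPSESSID=abc123")

print(a)
print(b)
print(c)
```

Both session variants come out as the same path here, which is why I expected a single entry per document. Note that the patterns also drop any parameters that come after PHPSESSID in the query string.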
Mindshare Interactive Campaigns, LLC
202.654.0832 - www.mindshare.net
Received on Tue Dec 6 12:39:33 2005