Hi everyone,
I am using swish-e 2.0.3 to index a site via the HTTP access method.
This site is structured in such a way that every page can be produced
with (normal) or without ("printerfriendly") a navigation menu on one
side. All the pages are PHP pages. The printerfriendly formatting is
triggered by clicking on a link containing the page's URI and the
additional variable "printerfriendly", resulting in links to pages like
http://foo.bar/foo.php?printerfriendly=1.
With the regular swish-e HTTP indexing, every page on this site is
indexed twice, since the printerfriendly link is being followed by the
spider.
I have tried ReplaceRules, but this only cleans up the URI which is
written to the index; it does not prevent the page from being indexed.
Using a hidden field in a form is only a partial solution, unless some
kind soul could enlighten me on how to use regular text as a submit
button.
Is there any way to avoid this double indexing?
Is there a general way to strip off all parts of the query string and
*then* compare the URL to be indexed? I could not find anything about
this in the manual or in the list archives.
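To make clear what I mean by "strip, then compare", here is a rough
sketch in Python. It is not swish-e code and says nothing about how the
spider works internally; strip_query and unique_pages are names I made
up for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def unique_pages(urls):
    """Keep one stripped URL per page, in first-seen order."""
    seen = set()
    result = []
    for url in urls:
        key = strip_query(url)  # compare only after stripping
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Hypothetical list of links a spider might collect from the site.
links = [
    "http://foo.bar/foo.php",
    "http://foo.bar/foo.php?printerfriendly=1",
    "http://foo.bar/bar.php?printerfriendly=1",
]
deduped = unique_pages(links)
```

With this kind of comparison, foo.php and foo.php?printerfriendly=1
would count as the same page and be fetched only once.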
Maybe there is an HTML approach which does the job?
Thanks in advance.
Regards -- Stephan
--
Stephan Engelke engelke@gmx.net
*** I.R.S.: We've got what it takes to take what you've got! ***
Received on Mon Oct 2 10:02:17 2000