I think I figured this out.
IndexContents HTML2 .html
I need to use the HTML2 parser (not HTML) in IndexContents to index the contents; with HTML2, obeyRobotsNoIndex appears to be honored.
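Put together, a minimal sketch of the relevant cfg lines, assuming nothing else in the config assigns a different parser to .html files (the comments are my own reading of the behavior, not taken from the docs):

```
# Skip documents that carry <meta name="robots" content="noindex">
obeyRobotsNoIndex yes

# Parse .html files with the HTML2 parser; in my test the plain HTML
# parser indexed the file despite the robots meta tag
IndexContents HTML2 .html
```

After rebuilding the index, blah.html should show up as skipped again, as in the first output quoted below.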
MattO
-----Original Message-----
From: MattO [mailto:matto@tellme.com]
Sent: Wednesday, May 05, 2004 9:05 PM
To: 'swish-e@sunsite.berkeley.edu'
Subject: obeyRobotsNoIndex & IndexContents
Using SWISH-E 2.4.2 on Solaris 5.8 i386
If, in my swish-e cfg, I set:
obeyRobotsNoIndex yes
and I have the following in my content:
<meta name="robots" content="noindex">
during indexing I see:
blah.html - Using DEFAULT (HTML2) parser - (Skipped due to Robots Excluion Rule in meta tag)
Aside from the typo in the output, that's what I'd expect.
If I then add the following directive to my cfg:
IndexContents HTML .html
and rebuild the index, I see:
blah.html - Using HTML parser - (496 words)
Apologies if this is a FAQ, but can't I get swish-e to apply the
"obeyRobotsNoIndex" rule first, and only apply "IndexContents" if the
robots meta tag isn't present?
Thanks.
MattO
Received on Wed May 5 21:25:56 2004