Hi.
I am running Swish-E 2.4.7 (on RHEL5) and I am trying to skip a few HTML
tags (specifically “script”, as in <script ...></script>) inside HEAD and
BODY while parsing HTML files. Despite the configuration directive
“IgnoreMetaTags script style link select”, the “script” tag is still being
parsed. That generates many errors such as:
error: Unexpected end tag : dt '<dt>' + listingName + '</dt>' +
What am I doing wrong? Here is my config file:
--- CONFIG FILE ---
# Index only HTML files
IndexOnly .html .htm
# Otherwise, use the HTML parser
DefaultContents HTML*
# Define metanames ranks
MetaNamesRank 10 title
MetaNamesRank 5 swishdefault
# Add document description to index
StoreDescription HTML* <body> 20000
# Define custom properties (meta description)
PropertyNames metadescription
PropertyNameAlias metadescription description
# Ignore total number of words when ranking
IgnoreTotalWordCountWhenRanking no
# Ignore selected HTML tags
IgnoreMetaTags script style link select
# Define max depth
MaxDepth 6
# Define delay (seconds)
Delay 0
# Define location of the spider script
SpiderDirectory /usr/local/swish-e/lib/swish-e/
# Define temporary directory
TmpDir /var/tmp
--- END CONFIG FILE ---
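One possible stopgap, if no directive turns out to cover this, would be to strip the <script> blocks from each document before it reaches the parser. The following is only a minimal sketch in Python, assuming a pre-processing step that runs outside Swish-E; the name strip_script_blocks is hypothetical and not part of Swish-E, and the regular expression is a rough approximation rather than a real HTML parser.

```python
import re

# Matches one whole <script ...>...</script> element, case-insensitively and
# across line breaks. This is a regex approximation, not a full HTML parser.
SCRIPT_BLOCK = re.compile(
    r"<script\b[^>]*>.*?</script\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_script_blocks(html: str) -> str:
    """Return the document with every <script> element removed.

    Hypothetical helper for illustration; it is not part of Swish-E.
    """
    return SCRIPT_BLOCK.sub("", html)
```

Run over a page before indexing, this would remove inline JavaScript such as the '<dt>' + listingName + '</dt>' string from the error above, so the HTML parser never sees it.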
Any help is appreciated. Thank you.
Received on Tue Jul 27 13:50:29 2010