Re: Probably dumb newbie question.

From: Bill Moseley <moseley(at)>
Date: Thu Aug 26 2004 - 15:16:42 GMT
On Thu, Aug 26, 2004 at 04:07:04AM -0700, Nic Gibson wrote:
> I'm having an odd problem with swish-e 2.4.2. I have an index generated using 
> Contrary to my expectations it appears to be indexing the href content
> of html anchors. I've attached the index configuration file to this message.  The only
> odd thing I can think of about this particular website is that the URLs don't have
> file extensions (see However, the content type
> is definitely correct.

You might set this in your config file to see the parser's warnings:

   ParserWarnLevel 9

When I tried it, all I saw were some errors about HTML entities that
couldn't be mapped to ISO-8859-1.

Otherwise, can you show the text of the hrefs that is being indexed?
You will likely get better help if you can provide a working example.

I added "/" to WordCharacters (along with a-z and 0-9), ran with
-T indexed_words, and didn't see anything that looked like a URL path.
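
Roughly, that test looks like the sketch below. The file name
swish.conf is only a placeholder, and I'm assuming the characters have
to be listed out one by one rather than given as ranges:

   # in swish.conf (placeholder name): letters, digits, plus "/"
   WordCharacters abcdefghijklmnopqrstuvwxyz0123456789/

   # then index with word debugging turned on
   swish-e -c swish.conf -T indexed_words

Add whatever input options your own setup needs. If href values were
being indexed, path-like words containing "/" should show up in that
output.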

Bill Moseley

Received on Thu Aug 26 08:17:15 2004