HTML Entities

From: Bill Moseley <moseley(at)>
Date: Mon Nov 27 2000 - 20:21:47 GMT
Ok, last post.

In config.h, the comment for WORDCHARS says:

** Note that if you omit "0123456789&#;" you will not be able to
** index HTML entities. 

Why should WordCharacters have anything to do with HTML Entities?
Shouldn't HTML entities be converted *before* extracting words from the
source with WordCharacters, BeginChars, EndChars, IgnoreLast, IgnoreFirst?
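
To make the ordering I have in mind concrete, here is a minimal sketch in Python. It is not Swish-e's code, only an illustration. The WORD_CHARS set and the extract_words() helper are hypothetical names invented for this example, and the BeginChars/EndChars/IgnoreFirst/IgnoreLast steps are left out. It decodes entities first, then splits on a word-character set that does not contain "&", "#", or ";":

```python
import html
import re

# Hypothetical word-character set for illustration only.
# Note that it does NOT contain "&", "#", or ";".
WORD_CHARS = "abcdefghijklmnopqrstuvwxyzé0123456789"


def extract_words(text, word_chars=WORD_CHARS, decode_first=True):
    """Return the runs of characters from word_chars found in text.

    When decode_first is true, HTML entities are converted to the
    characters they stand for before any word splitting happens.
    """
    if decode_first:
        text = html.unescape(text)  # "&eacute;" and "&#233;" both become "é"
    text = text.lower()
    pattern = "[" + re.escape(word_chars) + "]+"
    return re.findall(pattern, text)


source = "caf&eacute; &amp; r&#233;sum&#233;"

# Entities decoded before splitting: whole words survive.
decoded_words = extract_words(source)

# Entities left in place: each word is broken at "&", "#", and ";".
raw_words = extract_words(source, decode_first=False)
```

With the entities decoded first, decoded_words holds "café" and "résumé". With them left in place, raw_words holds fragments such as "caf", "eacute", and "233", which is why the entity characters would otherwise have to be listed in the word-character set.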


Bill Moseley
Received on Mon Nov 27 20:24:23 2000