Bill Moseley wrote:
> On Mon, Feb 02, 2004 at 10:02:38PM -0800, Peter Karman wrote:
>
>
>>The difference seems to be that the XML2 version splits words on tags,
>>while the HTML2 parser does not.
>
>
> That might be true in some cases. It's been discussed on the list
> before how to deal with
>
> <tag>text</tag><tag>other</tag>
>
> is that one or two words?
>
>
Ah. So this:

  http://swish-e.org/Discussion/archive/2003-12/6688.html

refers to this issue:

  * Insert whitespace between tags
    Parser.c was updated to flush the text buffer before and after
    every (non-inline HTML) tag. The problem was that:
      foo<tag>bar</tag>baz
    would index as a single word "foobarbaz".
Where is the list of non-inline HTML tags defined? In the libxml2 HTML
parser, or in swish-e somewhere?
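
To make sure I understand the changelog entry quoted above, here is a rough sketch of the flush-on-tag idea in Python. It is not the actual Parser.c code: the INLINE_TAGS set, the WordSplitter class, and index_words() are all hypothetical names made up for illustration, and the real list of inline tags may well differ.

```python
from html.parser import HTMLParser

# Hypothetical list of inline tags (an assumption for this sketch only).
# Text is NOT split at these tags; it is split at every other tag.
INLINE_TAGS = {"b", "i", "em", "strong", "span", "a", "font"}


class WordSplitter(HTMLParser):
    """Collects words, optionally flushing the text buffer at tag boundaries."""

    def __init__(self, flush_on_tags=True):
        super().__init__()
        self.flush_on_tags = flush_on_tags
        self._text_buffer = []  # text seen since the last flush
        self.words = []         # words emitted so far

    def flush_text(self):
        # Split the buffered text on whitespace and emit the pieces as words.
        self.words.extend("".join(self._text_buffer).split())
        self._text_buffer = []

    def _tag_boundary(self, tag):
        # Flush before and after every non-inline tag, which has the same
        # effect as inserting whitespace at that position.
        if self.flush_on_tags and tag not in INLINE_TAGS:
            self.flush_text()

    def handle_starttag(self, tag, attrs):
        self._tag_boundary(tag)

    def handle_endtag(self, tag):
        self._tag_boundary(tag)

    def handle_data(self, data):
        self._text_buffer.append(data)


def index_words(markup, flush_on_tags=True):
    parser = WordSplitter(flush_on_tags)
    parser.feed(markup)
    parser.close()       # forces any trailing text through handle_data()
    parser.flush_text()  # emit whatever is still buffered
    return parser.words


print(index_words("foo<tag>bar</tag>baz", flush_on_tags=False))  # ['foobarbaz']
print(index_words("foo<tag>bar</tag>baz"))                       # ['foo', 'bar', 'baz']
print(index_words("foo<b>bar</b>baz"))                           # ['foobarbaz']
```

With flushing turned off, the text on either side of a tag runs together into one word, which is the old behavior the changelog describes. With flushing on, only tags in the inline list leave the surrounding text joined.
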
>>-h[option]
>
>
>
> Someone still uses -h?
>
>
<grin> you mean instead of --help or something?
> No, did you look at the code in check_html_tag()?
>
I will.
> As for the rest of your question... you will have to wait. My wife says
> I have to make the coffee.
>
hope it was good and greasy.
pek
--
Peter Karman - Software Publications Programmer - Cray Inc
phone: 651-605-9009 - mailto:karman@cray.com
Received on Tue Feb 3 09:12:16 2004