Re: swish-e-1.3.2-PHRASEk.tar.gz released

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed May 17 2000 - 13:28:53 GMT
At 05:23 AM 05/17/00 -0700, marc 'embee' borms wrote:
>e.g. this file :

<HTML>
<PATIENT>
<NAME>Ray Charles</NAME>
<REQUEST>none</REQUEST>
<DRUG>
Aspirine : 2
Valium   : 3
</DRUG>
</PATIENT>
</HTML>

>The program works now when executing :
>
>swish-e ... -w "name=Ray" .
>
>It should work too when asking for :
>
>swish-e ... -w "patient=Ray" ( because the content of "name" is also content
>of "patient" ).

How much nesting is ok?  Should

swish-e -w "html=Ray" work, too?

I'm somewhat concerned about this change in MetaNames.  First, do I
understand correctly now that:

<META NAME="foo" CONTENT="bar">

is now the same as

<FOO>bar</FOO>

Correct?  

And searching for -w "foo=word" would find word in either place if a
document had both a <META> named FOO and a <FOO> tag?

Will <FOO> be found anywhere in the document, or just within <BODY> tags?

What happens if there's an open <FOO> and no </FOO> found?

What happens if:
<FOO>
   some text
     <FOO>
         more text
     </FOO>
   something else
</FOO>

What happens if both FOO and BAR are Metanames and:

<FOO>
   outside
      <BAR>inside</BAR>
   last
</FOO>

Will the word "inside" be found in both FOO and BAR?

What if <BAR> is not a Metaname?  Will "inside" be included in FOO?
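
To make the nesting questions above concrete, here is a minimal sketch in Python of one possible set of answers. It is not swish-e's code and makes no claim about what swish-e actually does. The class MetaNameIndexer and the function index_document are hypothetical names. The assumed rules are: a word is indexed under every metaname tag that is open at that point, a tag that is not a metaname is transparent, and an unclosed tag stays open to the end of the document.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaNameIndexer(HTMLParser):
    """Hypothetical indexer: a word belongs to every metaname tag open around it."""

    def __init__(self, metanames):
        super().__init__()
        self.metanames = {m.lower() for m in metanames}
        self.open_tags = []            # stack of currently open metaname tags
        self.index = defaultdict(set)  # metaname -> set of words

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <FOO> arrives as "foo".
        if tag in self.metanames:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the innermost open tag with this name; ignore stray end tags.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i]
                break

    def handle_data(self, data):
        # Assumption: every open metaname receives the word, however deep.
        for word in data.split():
            for tag in set(self.open_tags):
                self.index[tag].add(word.lower())


def index_document(text, metanames):
    parser = MetaNameIndexer(metanames)
    parser.feed(text)
    parser.close()  # flush trailing text, e.g. after an unclosed <FOO>
    return parser.index


doc = "<FOO> outside <BAR>inside</BAR> last </FOO>"
both = index_document(doc, ["foo", "bar"])
only_foo = index_document(doc, ["foo"])
print(sorted(both["foo"]), sorted(both["bar"]), sorted(only_foo["foo"]))
```

Under these assumed rules "inside" lands in both FOO and BAR, and it still lands in FOO when BAR is not a metaname. Other rules are just as defensible (innermost tag only, or a nesting limit), which is why the behavior needs to be stated.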


I might like to have an indexing setting, say STRICT_METANAMES, meaning
that only <META> tags would be indexed, in case a <META> tag conflicts
with an XML tag.
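
As a rough illustration of what such a switch might do, here is a small Python sketch. The strict_metanames parameter, the MetaCollector class, and the collect function are all hypothetical, and none of this is an existing swish-e option. The assumption is that <META NAME=... CONTENT=...> is always indexed, while <FOO>...</FOO> style tags are indexed only when strict mode is off.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Hypothetical collector with a strict switch for <META>-only indexing."""

    def __init__(self, metanames, strict_metanames=False):
        super().__init__()
        self.metanames = {m.lower() for m in metanames}
        self.strict = strict_metanames
        self.open_tags = []            # open <FOO>-style metaname tags
        self.index = defaultdict(set)  # metaname -> set of words

    def _add(self, name, text):
        self.index[name].update(w.lower() for w in text.split())

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Classic form: <META NAME="foo" CONTENT="bar">
            name = (attrs.get("name") or "").lower()
            if name in self.metanames:
                self._add(name, attrs.get("content") or "")
        elif not self.strict and tag in self.metanames:
            # XML-style form: <FOO>bar</FOO>, skipped in strict mode
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            self.open_tags.remove(tag)

    def handle_data(self, data):
        for tag in set(self.open_tags):
            self._add(tag, data)


def collect(text, metanames, strict_metanames=False):
    parser = MetaCollector(metanames, strict_metanames)
    parser.feed(text)
    parser.close()
    return parser.index


doc = '<META NAME="foo" CONTENT="bar"><FOO>baz</FOO>'
print(sorted(collect(doc, ["foo"])["foo"]))
print(sorted(collect(doc, ["foo"], strict_metanames=True)["foo"]))
```

With strict mode off, a search on foo would match both "bar" and "baz"; with it on, only the <META> content "bar" is indexed under foo.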



Bill Moseley
mailto:moseley@hank.org
Received on Wed May 17 09:32:42 2000