

Re: XML and PropertyNames printing results

From: Prosper Correa <prosper(at)not-real.correa.org>
Date: Sat Apr 28 2001 - 15:44:59 GMT
The point is that Swish lets the user retrieve properties in the results (cf.
the -p option).
So if the returned property is not the one that matched the search, the option
is not very useful ;-)

I think that returning the nearest property would be a good solution for the
moment.

What do you think?

Prosper



Bill Moseley wrote:

> At 10:57 AM 04/22/01 -0700, Prosper Correa wrote:
> >Here is the XML file:
> >
> ><ROOT>
> >
> ><COUNTRY>
> ><nic>0017</nic>
> ><naf>china</naf>
> ></COUNTRY>
> >
> ><COUNTRY>
> ><nic>0013</nic>
> ><naf>japan</naf>
> ></COUNTRY>
> >
> ></ROOT>
> >
> >
> >I get this result:
> >
> >1000 ./test/005403.txt "005403.txt" 57 1 "japan" "0013"
> >
> >And the correct result should be:
> >
> >1000 ./test/005403.txt "005403.txt" 57 1 "china" "0017"
> >
> >The last information in the file is always printed,
> >not the RIGHT information.
>
> Why is the second one the "right" information?   Swish indexes the file,
> but it doesn't understand the structure of your documents -- it doesn't
> keep properties related to any outer nested metaname.  You search for
> NIC=0017 and it tells you that can be found in ./test/005403.txt.  That's all.
>
> You have a couple of choices. One would be to split your documents into
> separate parts and index those as separate documents.  You can physically
> split the documents, or you could use the "prog" feature to parse your
> documents and index the parts as separate documents.
>
> The other option, which I've never tried, is to use the LST document type,
> but that uses the first <tag> as the document separator (which would be
> <root> in your case, I believe).  (Is this correct, Jose?)
>
> Bill Moseley
> mailto:moseley@hank.org
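
For reference, the "prog" approach Bill describes could look roughly like the
sketch below. It is only an illustration, not a tested Swish setup. It assumes
each <COUNTRY> element should become its own document, and that the prog
interface expects a "Path-Name:" and "Content-Length:" header, a blank line,
and then the content (please check the documentation of your Swish version).
The "#1", "#2" path suffix and the function names are made up for the example.

```python
#!/usr/bin/env python3
"""Sketch: split one XML file into one pseudo-document per <COUNTRY> record."""
import sys
import xml.etree.ElementTree as ET


def split_records(xml_text, source_path, record_tag="COUNTRY"):
    """Yield (pseudo_path, xml_fragment) for every record element found."""
    root = ET.fromstring(xml_text)
    for number, record in enumerate(root.iter(record_tag), start=1):
        # tostring() also serializes trailing whitespace, so strip it.
        fragment = ET.tostring(record, encoding="unicode").strip()
        # Hypothetical naming scheme: original path plus a record number.
        yield f"{source_path}#{number}", fragment


def format_document(path, body):
    """Build one document: headers, a blank line, then the content.

    The header names are an assumption about the prog input format.
    """
    size = len(body.encode("utf-8"))  # length in bytes, not characters
    return f"Path-Name: {path}\nContent-Length: {size}\n\n{body}"


if __name__ == "__main__":
    # Usage: script.py file1.xml file2.xml ...
    for source in sys.argv[1:]:
        with open(source, encoding="utf-8") as handle:
            text = handle.read()
        for path, fragment in split_records(text, source):
            sys.stdout.write(format_document(path, fragment))
```

With the file split this way, a search for nic=0017 would match only the
record containing "china", so the properties printed with -p would come from
that record and not from the last one in the file.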
Received on Sat Apr 28 15:45:26 2001