On 10/05/2007 01:37 PM, Ravi Murthy wrote:
> Compounding the queries doesn't quite give the results that I want - because
> it combines the results at the document level but not a specific node.
> Consider the document below.
>
> <root>
> <a>
> <b>bar</b>
> </a>
> <a>
> <c>foo</c>
> </a>
> <b>foo</b>
> </root>
>
> a.b = (foo) -- SHOULD BE FALSE
>
> but
>
> a = (foo) AND b = (foo) -- WILL RETURN TRUE
>
Swish-e's parser "flattens" the DOM at indexing time. In fact, the parser
knows nothing about the DOM at all; it uses SAX. So no, to answer your original
question, Swish-e doesn't have that kind of feature built in.

However, you might accomplish the same effect by pre-parsing the XML and feeding
the result to swish-e with the -S prog option. Then you could mimic the
hierarchy with the tag names themselves. That has some trade-offs: you couldn't
search for just 'a = foo', for example, unless your pre-parsing flattened the
tags into all the combinations that you'd want to be able to search for later.

[pek@dewpoint:~/tmp]$ cat conf
UndefinedMetaTags auto
DefaultContents XML
[pek@dewpoint:~/tmp]$ cat nest.xml
<root>
<a.b>bar</a.b>
<a.c>foo</a.c>
<b>foo</b>
</root>
[pek@dewpoint:~/tmp]$ swish-e -w a.b = foo
# SWISH format: 2.5.6
# Search words: a.b = foo
# Removed stopwords:
err: no results
[pek@dewpoint:~/tmp]$ swish-e -w a.c = foo
# SWISH format: 2.5.6
# Search words: a.c = foo
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 nest.xml "nest.xml" 66
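
The pre-parsing step that turns the original nested document into something
like nest.xml could look roughly like the Python below. This is only a sketch
under some assumptions: the function names flatten and prog_record are made up
for illustration, only elements that directly contain text are emitted, and the
root element's own text and any tail text are dropped. The Path-Name and
Content-Length headers reflect my understanding of what -S prog expects on
stdin, so check them against the swish-e docs before relying on this.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def flatten(xml_text):
    """Return XML whose tag names encode the original nesting, e.g. <a.b>."""
    root = ET.fromstring(xml_text)
    lines = ["<%s>" % root.tag]

    def walk(elem, path):
        # Emit one flat element for every node that directly holds text.
        text = (elem.text or "").strip()
        if text:
            name = ".".join(path)
            lines.append("<%s>%s</%s>" % (name, escape(text), name))
        for child in elem:
            walk(child, path + [child.tag])

    # The root tag is kept as the wrapper and left out of the dotted names.
    for child in root:
        walk(child, [child.tag])
    lines.append("</%s>" % root.tag)
    return "\n".join(lines)


def prog_record(path, xml_text):
    """Wrap the flattened document in headers (assumed -S prog format)."""
    body = flatten(xml_text).encode("utf-8")
    # Content-Length is counted in bytes, so measure the encoded body.
    header = "Path-Name: %s\nContent-Length: %d\n\n" % (path, len(body))
    return header.encode("utf-8") + body
```

Run on the document from the original question, flatten() produces the same
four-line body as nest.xml above. To also allow a plain 'a = foo' search, walk()
would have to emit each text node under every prefix of its path as well, which
is the trade-off mentioned earlier.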
--
Peter Karman . peter(at)not-real.peknet.com . http://peknet.com/
Received on Fri Oct 5 14:50:15 2007