
Re: XML Meta name question.

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Oct 02 2002 - 15:16:30 GMT

At 07:39 AM 10/02/02 -0700, William Bailey wrote:
><?xml version="1.0"?>
><recording title="16 Crappy Pop Songs." subtitle="The complete cheese 
>collection.">
>  <artist class="main">Main artist for this recording</artist>
>  <artist class="supporting">Supporting artist</artist>
>  <track title="A nice kind of life.">
>    <artist>Another processed band from a TV show.</artist>
>    track desc blah blah blah blah track note.
>  </track>
>  <track title="Ohhh it good to be owned by a record label">
>    <artist>The cant sing crew</artist>
>    track desc blah blah blah blah track note again.
>  </track>
>  recording desc blah blah blah description and stuff
></recording>


>Now the only problem i see here is that i cant just do a search for the track
>artists as it has added the track artists to the generic [artist] meta tag.
>What i really need is for there to be an option to create the meta tags based
>on its current position in the XML tree so for example the main artists would
>be recording.artist but the track artists would be searchable on
>recording.track.artist this way you could search on a specific branch of the
>tree and will therefore not have to give each element a unique name.

Yes, you are right.  Swish flattens the tags out.  Swish uses a SAX parser
so it doesn't really know anything about the structure of the tree, but it
probably could maintain some state of nested metanames.

Probably premature in saying this, but I'll bet it would not be too hard to
add that feature so that:

<first>
   <second>  -> becomes meta first.second
      <third> -> becomes first.second.third

Is that what you have in mind?
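
To make the idea concrete, here is a rough sketch in Python of how a SAX-style
parser can keep a stack of open tags and derive a dotted metaname from it.
This is not the code in parser.c; the names DottedPathHandler and
dotted_metanames are made up for illustration, and attributes are ignored.

```python
import xml.sax


class DottedPathHandler(xml.sax.ContentHandler):
    """Collect (dotted_path, text) pairs while streaming through a document."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of the elements that are currently open
        self.found = []   # (dotted metaname, text) pairs, in document order

    def startElement(self, name, attrs):
        # Entering a tag: remember it so nested text knows where it lives.
        self.stack.append(name)

    def endElement(self, name):
        # Leaving a tag: drop it from the current path.
        self.stack.pop()

    def characters(self, content):
        # SAX may deliver one text node in several chunks; each non-blank
        # chunk is recorded under the path that is open at that moment.
        text = content.strip()
        if text:
            self.found.append((".".join(self.stack), text))


def dotted_metanames(xml_text):
    """Parse xml_text and return the list of (dotted_path, text) pairs."""
    handler = DottedPathHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.found
```

With the sample document above, text inside a track's <artist> would be filed
under recording.track.artist, while the top-level artists would land under
recording.artist.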

The parser actually does track which (nested) metanames are in use, but
only if they are defined as metanames (it uses that list to determine how
to index a given word at any point).

It might get confusing when mixed with UndefinedXMLAttributes and
XMLClassAttributes.

Another issue that's been bugging me for a while is that if a tag is ignored
somehow (e.g. IgnoreMetaTags) then everything inside that tag is ignored,
even if it's defined as a metaname.  That makes it hard to pick out just some
of the data that's nested inside the doc.

XML is good at storing complex, nested data, and that is exactly what makes it
hard for something like swish to have a general way to flatten it into a set
of non-hierarchical tags.

The quick fix for you will be to use an XML parser yourself and rewrite the
docs on the fly with -S prog.  But I'll look at the code too.  You are
welcome to check out parser.c, too.  Always good to have someone look over
the code.
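
As a rough sketch of that quick fix (Python here, but any language with an XML
parser would do): rename every element to its dotted path, then print the
rewritten document in the form -S prog reads, with Path-Name and
Content-Length headers followed by a blank line and the body.  The function
names are made up for illustration, and whether dots are acceptable in your
metanames is an assumption, so swap in another separator if they are not.

```python
import xml.etree.ElementTree as ET


def rename_to_paths(elem, prefix=""):
    """Rename elem and all its descendants to their dotted tree path."""
    # Work out this element's path from the ORIGINAL tag before renaming it.
    path = f"{prefix}.{elem.tag}" if prefix else elem.tag
    for child in list(elem):
        rename_to_paths(child, path)
    elem.tag = path


def rewrite(xml_text):
    """Return xml_text with every tag replaced by its dotted path."""
    root = ET.fromstring(xml_text)
    rename_to_paths(root)
    return ET.tostring(root, encoding="unicode")


def prog_record(path_name, xml_text):
    """Build one document record (headers, blank line, body) as bytes."""
    body = rewrite(xml_text).encode("utf-8")
    # Content-Length has to be the size of the body in bytes.
    header = f"Path-Name: {path_name}\nContent-Length: {len(body)}\n\n"
    return header.encode("utf-8") + body
```

A wrapper script would loop over your files and write each
prog_record(...) result to sys.stdout.buffer.  The metanames you then
define in the config would be the dotted ones, e.g. recording.track.artist.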


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Wed Oct 2 15:20:17 2002