Re: StoreDescription for XML, indexing Powerpoint

From: Andrew Smith <asmith(at)not-real.compbio.berkeley.edu>
Date: Wed Jan 29 2003 - 23:42:57 GMT
> > To use StoreDescription for XML, you need to give a tag in the XML from 
> > which to extract the description text; and the same is true for storing 
> > descriptions of HTML. This makes sense for HTML (which is a single 
> > standard where you can use e.g. <body> as the StoreDescription tag), but 
> > doesn't seem to for XML (which is extensible and thus you define your own 
> > tags and format). I.e. the files you are indexing could contain many 
> > different types of XML files and there will be no single XML tag that they 
> > all share which could be used as the common StoreDescription tag. So it 
> > seems StoreDescription should be changed for XML files to either allow 
> > entire (up to some number of characters, as TXT descriptions are 
> > specified) XML files to be stored or to allow multiple tags to be 
> > specified. Is there any way to get around this in the current Swish-e to 
> > store entire XML file contents as descriptions?
> 
> An example would be helpful.

A simple example: suppose the directory you want to index and make 
searchable contains two different kinds of XML files:

(1) messages between people

<note>
<to>John</to>
<from>Andrew</from>
<heading>reminder</heading>
<body>Don't forget our meeting.</body>
</note>

(2) employee records

<employees>

<employee>
<name>John Doe</name>
<ssn>432-87-8256</ssn>
<position>programmer</position>
</employee>

..
</employees>

These two types of XML files share no tag, so there is no single XML tag 
you can give Swish-e to use as the StoreDescription tag.
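
The lack of a shared tag can be checked mechanically. The following is a 
minimal Python sketch, not anything Swish-e itself does: it parses the two 
sample documents above (with the elided employee records left out) and 
intersects their element names. The helper name tag_names and the two 
constants are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The two sample documents from this message; the elided employee
# records ("..") are simply left out.
NOTE_XML = """<note>
<to>John</to>
<from>Andrew</from>
<heading>reminder</heading>
<body>Don't forget our meeting.</body>
</note>"""

EMPLOYEES_XML = """<employees>
<employee>
<name>John Doe</name>
<ssn>432-87-8256</ssn>
<position>programmer</position>
</employee>
</employees>"""


def tag_names(xml_text):
    """Return the set of element names used anywhere in the document.

    Hypothetical helper for this illustration only.
    """
    return {element.tag for element in ET.fromstring(xml_text).iter()}


note_tags = tag_names(NOTE_XML)
employee_tags = tag_names(EMPLOYEES_XML)

# Any tag usable as one StoreDescription tag for both file types
# would have to appear in this intersection.
common_tags = note_tags & employee_tags
print(sorted(common_tags))
```

For these two samples the intersection is empty, so no single tag covers 
both kinds of file.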

> 
> True, Swish probably can not deal with all possible uses of XML.
> 
> If you want to use multiple tag names for the description you might be
> able to use the PropertyNameAlias directive.

Yes, that might work. I'll look into it. Thanks.
Received on Wed Jan 29 23:43:25 2003