Make sure you are indexing the file as XML, not as HTML (which is the default).
Try the -T option to see exactly what is being indexed.
Use a single test document while you are testing.
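
A rough sketch of what that test setup might look like. The file names
(swish.conf, test.index, test.xml) are placeholders I made up, and I am
assuming your build accepts XML* (use the libxml2 parser if it is compiled
in, otherwise fall back to the built-in XML parser):

```
# swish.conf: test config, file names are placeholders
IndexFile ./test.index
# parse everything as XML instead of the default HTML
DefaultContents XML*
MetaNames Creator
UndefinedMetaTags ignore
```

Then index just the one file with tracing turned on:

```
swish-e -c swish.conf -i test.xml -T indexed_words
```

Running swish-e -T help should list the trace options your version supports,
in case indexed_words is not the one you want.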
Mike.Fountain@worldspan.com scribbled on 1/31/06 1:28 PM:
> Sorry if this is a repost, didn't see the first message come through.
>
>
>
> I've got a document with an XML format similar to
>
> <VisioDocument>
> <DocumentProperties>
> <Title>LithiaSprings</Title>
> <Creator>MFountain</Creator>
> </DocumentProperties>
> </VisioDocument>
>
>
> There are actually a ton of other tags. But, for example purposes, let's
> assume I only want to index the contents of the <Creator> tag.
>
> In my indexing config file I have:
> MetaNames Creator
> UndefinedMetaTags ignore
>
>
> Like that, I thought I would get the contents of the Creator tag indexed
> and nothing else in that document.
> However, when I run the index, it says zero words indexed.
>
> Just to make sure things are ok, I set
> UndefinedMetaTags index
>
> And, set like that, the indexing takes some time, and I'm assuming it's
> indexing all the stuff I don't want.
>
>
> To do a bit of debugging, I set
> UndefinedMetaTags error
>
> And, it comes back about not understanding the <VisioDocument> tag.
>
>
> So, how do I index just the nested tags I want without indexing everything
> in the document?
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Tue Jan 31 11:35:24 2006