Douglas Smith wrote:
>Yes, I would like to know more about this. I got this to work, but not
>in a nice way. I used the same filter line, with the "unzip content.xml"
>and there was lots of xml to parse. But XML2 would return nothing for
>some reason, and no content would get indexed.
After some back-and-forth by mail with Bill Moseley, our server indexed
quite successfully last night, based on the following configuration:
FileFilterMatch "/usr/bin/unzip" "-p \"%p\" content.xml" /\.(sxw|sxc|sxg)$/i
IndexContents XML* .xml .sxw .sxc .sxg
StoreDescription XML* <text:p> 20000
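
For anyone who wants to see what the filter hands to the parser, here is a minimal Python sketch (not part of swish-e) that does roughly the same thing as the unzip filter: it reads content.xml out of the archive and collects the text of every <text:p> element. The function names are my own, and matching on the local tag name "p" instead of the full namespace is an assumption made to keep it short.

```python
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def paragraphs(path_or_file, member="content.xml"):
    """Return the text of every paragraph element in the archive member."""
    with zipfile.ZipFile(path_or_file) as archive:
        # Rough equivalent of: unzip -p "<file>" content.xml
        xml_bytes = archive.read(member)
    root = ET.fromstring(xml_bytes)
    # Assumption: any element whose local name is "p" is a <text:p>.
    return [
        "".join(elem.itertext())
        for elem in root.iter()
        if local_name(elem.tag) == "p"
    ]
```

Calling paragraphs("some.sxw") on a document gives a quick idea of what would end up in the stored description.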
It seems all our OO files on the network are now stored with a
description. Of course this only covers the plain text in the
documents. OO metadata such as 'Author' is not included, since OO stores
it in another XML file. I might start writing a kind of "oo2xml"
script one of these days in order to deliver a more informative xml to
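
Such an "oo2xml" script does not exist yet; the following is only a rough Python sketch of the idea. It assumes the metadata lives in a member called meta.xml with elements whose local names are "title" and "creator", and the output element names (document, title, author, body, p) are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def oo2xml(path_or_file):
    """Merge selected metadata and paragraph text into one flat XML string."""
    with zipfile.ZipFile(path_or_file) as archive:
        meta = ET.fromstring(archive.read("meta.xml"))
        content = ET.fromstring(archive.read("content.xml"))

    doc = ET.Element("document")  # hypothetical output format

    # Assumption: map these metadata local names to output element names.
    wanted = {"title": "title", "creator": "author"}
    for elem in meta.iter():
        out_name = wanted.get(local_name(elem.tag))
        if out_name and elem.text:
            ET.SubElement(doc, out_name).text = elem.text

    # Copy the plain text of each paragraph into the body.
    body = ET.SubElement(doc, "body")
    for elem in content.iter():
        if local_name(elem.tag) == "p":
            ET.SubElement(body, "p").text = "".join(elem.itertext())

    return ET.tostring(doc, encoding="unicode")
```

The output could then be fed to the indexer through a filter in place of the raw content.xml, with the new element names declared in the configuration as needed.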
However, the originally mentioned error message remained:
Warning: XML parse error in file './QU030423im01.sxw' line 2. Error: not well-formed
Although it doesn't seem to affect the process, it was annoying to me.
It turns out my swish-e was not compiled with XML2 support.
I upgraded my libxml2 and recompiled swish-e. Now the error message has disappeared.
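
If a warning like this shows up, one way to check whether the document itself is at fault is to run content.xml through an independent parser. Below is a small Python sketch of such a check; it is only a sanity test with Python's own XML parser and says nothing about how swish-e parses the file. The function name is my own.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_well_formed(path_or_file, member="content.xml"):
    """Return (True, "") if the member parses, else (False, error message)."""
    with zipfile.ZipFile(path_or_file) as archive:
        data = archive.read(member)
    try:
        ET.fromstring(data)
    except ET.ParseError as err:
        # The message includes the line and column of the problem.
        return False, str(err)
    return True, ""
```

If this reports the file as well-formed while the indexer still complains, the indexer's parser setup is worth a look before blaming the documents.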
Received on Wed May 21 08:19:54 2003