On Thu, Nov 17, 2005 at 04:36:05AM -0800, Lars D. Noodén wrote:
> The mimetypes you list are not for OpenDocument but the immediate
> predecessor. They're close and the module should also work with them, but
> technically it's a different format. I'll try to make sure the module
> works with them, too.
Ah, ok. Those are the MIME types set on my machine for use with OpenOffice.
>
> Archive::Zip sounds like a good idea, but I had wanted to limit the number
> of additional modules needed and the Pdf2html filter was my model.
Archive::Zip is available as a Perl package for Windows (PPM), which might
be easier for Windows users to install than the unzip binary.
http://ppm.activestate.com/BuildStatus/5.8.html
What I'd probably do is use Archive::Zip if it's available and otherwise
fall back to unzip.
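
A rough sketch of that fallback, not tested against the actual filter.
The sub names pick_backend() and read_member() are made up for
illustration, and it assumes the binary is called "unzip" and is on the
PATH (on Windows it would be unzip.exe, and the list-form pipe open
needs a reasonably recent perl there):

```perl
use strict;
use warnings;
use File::Spec;

# Decide which extraction backend is usable on this machine.
# Returns 'Archive::Zip', 'unzip', or undef if neither is found.
sub pick_backend {
    # require inside eval so a missing module is not fatal
    return 'Archive::Zip' if eval { require Archive::Zip; 1 };

    # Otherwise look for an executable named "unzip" on the PATH
    for my $dir ( File::Spec->path ) {
        my $candidate = File::Spec->catfile( $dir, 'unzip' );
        return 'unzip' if -x $candidate && !-d $candidate;
    }
    return;
}

# Return the contents of one archive member (e.g. content.xml).
sub read_member {
    my ( $file, $member ) = @_;
    my $backend = pick_backend()
        or die "Neither Archive::Zip nor unzip is available\n";

    if ( $backend eq 'Archive::Zip' ) {
        my $zip = Archive::Zip->new;
        $zip->read($file) == Archive::Zip::AZ_OK()
            or die "Cannot read $file\n";
        my $content = $zip->contents($member);
        defined $content or die "No $member in $file\n";
        return $content;
    }

    # Fallback: "unzip -p" writes the member to stdout
    open my $fh, '-|', 'unzip', '-p', $file, $member
        or die "Cannot run unzip: $!\n";
    local $/;    # slurp the whole member
    my $content = <$fh>;
    close $fh or die "unzip failed on $file\n";
    return $content;
}
```

The filter would then call something like read_member($file, 'content.xml')
and not care which backend did the work.
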
The pdf filter uses the binary because there is no library and
associated Perl module. I discussed creating a library and module
with the author, but he was not ready to allow that at the time, IIRC.
> There is actually someone already working on an OpenDocument to XHTML
> conversion using XSLT:
> http://books.evc-cit.info/odf_utils/odt_to_xhtml.html
>
> Converting to XHTML or using the XML parser to extract or rewrite certain
> fields seems a lot more work than using aliases in the swish config file
> to map the tag names. However, mapping means that the config file has to
> be set up correctly.
There should probably be two filters: one that just creates an HTML-like
document, and one like yours for people who want the raw document for
more control when indexing.
Thanks,
--
Bill Moseley
moseley@hank.org
Unsubscribe from or help with the swish-e list:
http://swish-e.org/Discussion/
Help with Swish-e:
http://swish-e.org/current/docs
swish-e@sunsite.berkeley.edu
Received on Thu Nov 17 04:52:48 2005