Re: Catdoc - following on from the Open2 problems i

From: <Allan_Watts(at)not-real.amp.com.au>
Date: Sun Mar 28 2004 - 22:21:54 GMT
I had most success (under Win 2000) with the "original" version of catdoc
(http://www.45.free.net/~vitus/ice/catdoc/) and using Perl

my $shortname = Win32::GetShortPathName($filename);

to get around the long file name problem.  Even then, a small number of
MS-Word documents caused catdoc to hang with a "Bad BBD entry!" error
message.  (I excluded these files explicitly just to get something
working.)
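
A rough sketch of how the two workarounds above might fit together, for
illustration only.  The helper names (short_name, catdoc_command), the
%skip list, and the example path are hypothetical and not part of my
original script; only Win32::GetShortPathName is the real call.  On
non-Windows systems the sketch simply passes the path through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 8.3 short path on Windows, or the
# original path elsewhere (or when no short name is available).
sub short_name {
    my ($filename) = @_;
    return $filename unless $^O eq 'MSWin32';
    require Win32;
    my $short = Win32::GetShortPathName($filename);
    return ( defined $short && length $short ) ? $short : $filename;
}

# Hypothetical skip list: documents known to make catdoc hang.
# Keys are lower-cased because Windows paths are case-insensitive.
my %skip = map { lc($_) => 1 } ('C:\\docs\\broken report.doc');

# Hypothetical helper: build the catdoc command as a list, or return
# an empty list when the file is on the skip list.
sub catdoc_command {
    my ($filename) = @_;
    return if $skip{ lc $filename };
    return ( 'catdoc', short_name($filename) );
}
```

Passing the command around as a list rather than a single string avoids
a second round of shell quoting for names that contain spaces.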

Can't remember now what the exact problems were with wvWare (just as likely
to be me!).

(Apologies for taking so long to get to this.)

Allan






David L Norris <dave@webaugur.com>@sunsite.berkeley.edu on 24/03/2004
02:54:35 PM

Please respond to dave@webaugur.com

Sent by:    swish-e@sunsite.berkeley.edu


To:    Multiple recipients of list <swish-e@sunsite.berkeley.edu>
cc:
Subject:    [SWISH-E] Re: Catdoc - following on from the Open2 problems i


On Wed, 2004-03-24 at 02:20, Ahmad, Zeeshan (FMC) wrote:
> I am trying to index word documents on windows using swish-e 2.4. Although
> some documents get indexed, for others Dr. Watson reports an access
> violation in catdoc.exe.

That's not surprising.  The version included with SWISH-E is an
unofficial port I made just to support long filenames.  wvWare is
probably the better choice since it's actively supported on Windows by
the upstream maintainers.  Also, wvWare is used by Abiword as the Word
document importer.  catdoc's maintainer indicated he has no interest in
supporting Windows.

> Is this wvWare feature ready for production use on Windows by any chance?

Well, all I can say is to test it with your documents.  I don't use
Windows at all beyond testing.  In my tests, wvWare works better than
catdoc and correctly converted all of the documents I had available.
catdoc failed to decode various text objects (links, etc.) in numerous
documents (on Windows and UNIX).  I've not seen catdoc crash; however, it
seems likely it could crash when reading documents with unexpected OLE
objects or markup.  And my catdoc port may crash for any number of other
reasons.

--
 David Norris
  http://www.webaugur.com/dave/
  ICQ - 412039



Received on Sun Mar 28 14:21:55 2004