This should be pretty easy to do. It sounds like the main difference between wvText and catdoc is that wvText writes its output to a file, whereas catdoc prints it to standard output.
Start with the Doc2txt.pm filter, and change this part:
# Grab output from running program
my $content = $filter->run_program( 'catdoc', $file );
return unless $content;
to something like this (I can't test this code right now but hopefully it will work):
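$filter->run_program('/path/to/wvText', $file, '/path/to/temp/output/file');
open CONVERTED, '<', '/path/to/temp/output/file' or die "couldn't open '/path/to/temp/output/file' for reading - $!";
# Declare $content here, since the original "my $content" line was replaced
my $content = join '', <CONVERTED>;
close CONVERTED;
return unless $content;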
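A fixed output path will collide if two documents are filtered at the same time, so it may be worth letting Perl pick the temporary file name. Here is a rough, untested-against-wvText sketch of that idea using the core File::Temp module. The helper name convert_via_tempfile is made up for illustration (it is not part of the swish-e filter code), and it assumes the converter is called as "program input output", as described for wvText in the original message below.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of swish-e): run a converter that takes
# "input output" arguments and return whatever it wrote to the output file.
sub convert_via_tempfile {
    my ( $program, $input ) = @_;

    # Let File::Temp choose a unique name; UNLINK removes the file at exit.
    my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
    close $tmp_fh;

    # The list form of system() bypasses the shell, so odd file names are safe.
    system( $program, $input, $tmp_name ) == 0
        or die "'$program' failed on '$input' (status $?)";

    open my $in, '<', $tmp_name
        or die "couldn't open '$tmp_name' for reading - $!";
    local $/;    # slurp mode: read the whole file in one go
    my $content = <$in>;
    close $in;

    return $content;
}
```

In the filter this would be called as something like my $content = convert_via_tempfile('/path/to/wvText', $file); followed by the usual return unless $content; check.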
>>> "Roubart Capcap" <RCapcap@scif.com> 07/23/03 08:12AM >>>
I am having problems using the swish filter with doc2txt and catdoc to filter MSWord documents containing "forms". I found two programs, wvText and wvHtml, which can do the conversion; however, I do not know how to create the filter for them. Basically, wvText requires the input file name and the output file name. I am not a Perl programmer, so I would appreciate any help you could give.
Thank you for a very fast and flexible swish-e!!
Received on Wed Jul 23 17:21:50 2003