I'm still experimenting with ways to get around my Open2 problem (the
"open2: IO::Pipe: Can't spawn-NOWAIT:..." error after reading 64 files).
To recap, waitpid didn't solve the problem in Filter.pm for me. I sometimes
get the same NOWAIT error after reading 65 files (an improvement of 1), but
it is erratic: mostly the command prompt session just freezes.
However, waitpid did work in this test script (reading 1000 files called
0000.doc to 0999.doc - collected from my C: drive and copied into a
use strict;
use warnings;
use IPC::Open2;

for (my $k = 0; $k < 1000; $k++) {
    my $filename = "c:/cat/" . substr("0000$k", -4) . ".doc";
    #my $command = "c:\\data\\swish\\catdoc\\catdoc.exe $filename"; #
    my $command = "c:\\progra~1\\swish-e\\lib\\swish-e\\catdoc.exe $filename"; # Dave's version(?)
    my ($rdrfh, $wtrfh);
    my $pid = IPC::Open2::open2($rdrfh, $wtrfh, $command);
    binmode $rdrfh, ':crlf';
    local $/ = undef;                 # slurp mode: read all output at once
    my $content = <$rdrfh>;
    my $mtime = (stat $filename)[9];  # element 9 of stat() is the mtime
    my $size = length $content;
    waitpid $pid, 0;                  # reap the child before the next file
}
That is, it worked except for a number of particular files. I now seem to
be getting tangled up in catdoc/Win32 issues.
I tried two versions of catdoc.
The first was the one which came with Swish-e 2.4. (Sounds like Dave did
some good work with this to get it to read long file names.) Unfortunately
for a few of my Word documents it produced only a string of question marks
when run from the command line, or sometimes some text followed by a string
of question marks. When called while indexing, it seemed to cause swish-e
to hang on one of these files.
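One way to catch the question-mark output before it reaches the index might be a check like the one below. This is only a sketch: looks_garbled and its 0.5 cutoff are made-up names and values for illustration, not anything catdoc or swish-e provides, and the cutoff would need tuning against real documents.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if converter output looks like a failed
# conversion, i.e. it is empty or more than $threshold of its
# non-whitespace characters are question marks.
sub looks_garbled {
    my ($text, $threshold) = @_;
    $threshold = 0.5 unless defined $threshold;   # assumed cutoff
    return 1 unless defined $text;

    (my $visible = $text) =~ s/\s+//g;            # drop all whitespace
    return 1 if length($visible) == 0;

    my $qmarks = ($visible =~ tr/?//);            # count '?' characters
    return $qmarks / length($visible) > $threshold ? 1 : 0;
}
```

A filter could call looks_garbled($content) after reading the catdoc output and skip the file (or log its name) when it returns 1.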
I downloaded V.93.3 of catdoc from
This seemed to work better (except, it couldn't handle long filenames). And
it couldn't handle 10 of my files - giving a "Bad BBD entry!" error and
freezing (in 9 out of 10 cases). The files it didn't work on were large
files (20MB) with lots of JPEG images embedded (the staff newsletter!).
I guess I just battle on. (I am getting around the long filenames by copying
the file somewhere else first, and I now have a list of files that I
will ignore...) Any suggestions appreciated (e.g. a way to trap errors
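
For reference, the workaround described above (copy the file to a short name first, skip a list of known-bad files) could be sketched roughly as follows. The name convert_file, the temporary name swtmp.doc and the skip-list entry are all hypothetical, chosen only for illustration. The eval relies on open2 raising an exception starting with "open2:" when it cannot start the child; it traps that kind of failure, but it would not rescue a converter that hangs.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use IPC::Open2;

# Hypothetical skip list: files already known to hang or crash the converter.
my %skip = map { $_ => 1 } ('c:/cat/0042.doc');

# Run "$converter <short copy of $file>" and return its output, or undef
# if the file is on the skip list or the conversion could not be run.
sub convert_file {
    my ($converter, $file, $tmpdir) = @_;
    return undef if $skip{$file};

    # Copy to a short name so the converter never sees a long filename.
    my $short = File::Spec->catfile($tmpdir, 'swtmp.doc');
    copy($file, $short)
        or do { warn "copy failed for $file: $!\n"; return undef };

    my $content = eval {
        my ($rdrfh, $wtrfh);
        # open2 dies (message starts with "open2:") if the spawn fails.
        my $pid = open2($rdrfh, $wtrfh, "$converter $short");
        close $wtrfh;             # nothing to send to the child
        local $/;                 # slurp mode
        my $out = <$rdrfh>;
        close $rdrfh;
        waitpid $pid, 0;          # reap the child
        $out;
    };
    warn "conversion failed for $file: $@" if $@;
    unlink $short;                # remove the temporary copy
    return $content;
}
```

Calling convert_file($command, $filename, $tmpdir) in place of the bare open2 call keeps the copy, the skip list and the error trapping in one place.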
Received on Thu Jan 22 21:00:02 2004