BTW, here's an example of looking at what the Perl variables contain.
It also shows how Perl 5.6.1 is broken: it takes a string holding a
single UTF-8 character and splits it on a regular expression containing
an 8-bit character (not flagged as UTF-8).
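#!perl -w
use strict;
use Devel::Peek;
# One character above 0xFF, so perl stores it internally as UTF-8.
my $x = "\x{263A}";
Dump($x);
# One 8-bit character (0x98), not flagged as UTF-8.
my $y = chr( 128+24 );
Dump($y);
print "\nsplit..\n\n";
# Split the UTF-8 string on the 8-bit character.
my @foo = split /$y/, $x;
print "Split into ", scalar @foo, " scalars\n";
print "\nFirst element:\n";
Dump( $foo[0] );
print "\nSecond element\n";
Dump( $foo[1] );
print "Now try to print\n$foo[0]\n";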
Run this with 5.6.1 and you get:
SV = PV(0x80f6344) at 0x80fd444
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK,UTF8)
PV = 0x80f9e58 "\342\230\272"\0 <<< there's the UTF-8 char.
CUR = 3
LEN = 4
SV = PV(0x80f63b0) at 0x80fd414
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x80fe300 "\230"\0 <<< non-utf
CUR = 1
LEN = 2
split..
Split into 2 scalars
First element:
SV = PV(0x80f64d0) at 0x80fd3c0
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0x8107168 "\342"\0 <<< broken character
CUR = 1
LEN = 2
Second element
SV = PV(0x80f6494) at 0x8115fa4
REFCNT = 1
FLAGS = (POK,pPOK,UTF8)
PV = 0x8106e58 "\272"\0 <<< same here.
CUR = 1
LEN = 2
Now try to print
Perl 5.8 doesn't do this.
Maybe this will help you debug.
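For what it's worth, here is a rough sketch of why the pieces come out as "\342" and "\272": U+263A is stored as the three octets \342 \230 \272, and chr(128+24) is exactly the middle octet, \230. The sketch assumes Perl 5.8 or later (utf8::encode is not available in 5.6.1), and the octal_bytes helper is a made-up name for illustration, not part of the script above.

```perl
#!perl -w
use strict;

# Hypothetical helper (not from the original script): return the UTF-8
# octets of a character string in the octal notation Devel::Peek prints.
sub octal_bytes {
    my ($str) = @_;
    my $bytes = $str;        # copy, so the caller's string is untouched
    utf8::encode($bytes);    # characters -> UTF-8 octets (needs Perl 5.8+)
    return join ' ', map { sprintf '\\%03o', ord($_) } split //, $bytes;
}

my $x = "\x{263A}";
my $y = chr( 128+24 );

print "UTF-8 octets of \$x: ", octal_bytes($x), "\n";
printf "octet in \$y:       \\%03o\n", ord($y);

# On 5.8 and later the match is done on characters, not raw octets,
# so the 8-bit character does not match inside the encoded U+263A.
my @foo = split /$y/, $x;
print "Split into ", scalar @foo, " scalar(s)\n";
```

On a 5.8-or-later perl this should report the octets \342 \230 \272, the pattern octet \230, and a split into a single scalar, which is the character left intact.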
--
Bill Moseley moseley@hank.org
Received on Sat Mar 1 06:23:12 2003