On 09/13/2007 10:09 AM, Peter Karman wrote:
> I guess then the onus is on Perl to deal with mismatched encodings. It actually
> seems to do that reasonably well in some cases, horribly in others. The biggest
> issue I've seen is when it interprets bytes intended as UTF-8 as Latin1. There
> are cases where the same sequence of bytes is valid in both encodings, and Perl
> seems to assume Latin1 as the default.
I should add, as an example: take a look at my Search::Tools::UTF8 CPAN module.
IIRC, there are some tests in there that can be useful when unraveling the
Perl UTF-8 maze.
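
To make the ambiguity concrete, here is a minimal sketch using only the core
Encode module. It is not taken from Search::Tools::UTF8, and the variable names
are made up for illustration. It shows one byte sequence that is valid in both
encodings, and how the character count changes once Perl is told the bytes are
UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Two bytes that are valid in both encodings:
#   as UTF-8   they encode the single character U+00E9 (e with acute accent)
#   as Latin-1 they are two characters, U+00C3 followed by U+00A9
my $bytes = "\xC3\xA9";

# Left undecoded, Perl treats each byte as one character,
# which amounts to reading the string as Latin-1.
my $undecoded_length = length $bytes;

# Decoding explicitly tells Perl the bytes are UTF-8.
my $chars          = decode('UTF-8', $bytes);
my $decoded_length = length $chars;

printf "undecoded: %d character(s), decoded: %d character(s)\n",
    $undecoded_length, $decoded_length;
```

The point of the sketch is that Perl has no way to guess which reading was
intended, so decoding at the input boundary is what removes the ambiguity.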
Peter Karman . peter(at)not-real.peknet.com . http://peknet.com/
Users mailing list
Received on Thu Sep 13 11:39:00 2007