Re: [swish-e] utf8 again

From: Peter Karman <peter(at)>
Date: Mon Aug 11 2008 - 16:18:09 GMT
On 08/08/2008 04:49 PM, Michael Peters wrote:
> Brad Miele wrote:
>> not sure if this helps, but what we do is:
> Mine is simpler and just 1 line:
> $buffer =~ s/([^\p{IsASCII}])/sprintf('&amp;#x%X;', ord($1))/ge;

I wrote:

for just such cases as needing to store UTF-8 encoded text as a Swish-e Property.

I think \p{IsASCII} requires the double encoding of & -> &amp; because \p works on
characters, not bytes. The double-encoding approach will work just as well as the
Search::Tools hack does, but for different reasons.
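
A minimal sketch of the double-encoding approach quoted above, assuming the
buffer arrives as UTF-8 bytes and is decoded to characters before the
substitution runs. The helper name escape_non_ascii and the sample string are
hypothetical, not from the thread; the regex itself is Michael's one-liner.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not from the thread): replace every non-ASCII
# character with a hex numeric entity whose ampersand is itself
# encoded as &amp; (the "double encoding").
sub escape_non_ascii {
    my ($buffer) = @_;
    $buffer =~ s/([^\p{IsASCII}])/sprintf('&amp;#x%X;', ord($1))/ge;
    return $buffer;
}

# Decode first so that \p{IsASCII} is tested against characters
# rather than the individual bytes of the UTF-8 sequence.
my $bytes   = "caf\xC3\xA9";             # "cafe" with e-acute, as UTF-8 bytes
my $chars   = decode('UTF-8', $bytes);   # now a 4-character string
my $escaped = escape_non_ascii($chars);

print "$escaped\n";                      # prints: caf&amp;#xE9;
```

The result is plain ASCII, so it can be stored as a property without any
further encoding concerns; whatever displays it has to undo the &amp; step
before the entity is rendered.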

Peter Karman  .  peter(at)  .

