Re: Detecting multibyte/wide characters?

From: J Robinson <jrobinson852(at)not-real.yahoo.com>
Date: Tue Feb 22 2005 - 14:13:12 GMT
Hello All:

This is a follow-up on this topic from October of
2004. I tried the string_has_multibyte_chars()
suggestion below, but in my testing it never signals
that a string has wide chars. Maybe I should rephrase
my question:

- Indexing text in (say) the "en_US.UTF-8" LANG
locale, how do I detect if a page is going to result
in an error like

 input conversion failed due to input error
 Bytes: 0xB5 0x74 0xA3 0xBA

when indexing? I'd like to skip such pages.
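
One way to pre-screen a page, sketched here under assumptions: the bytes in the error above (0xB5 0x74 0xA3 0xBA) are not well-formed UTF-8, so a strict decode with the core Encode module should reject them before the page ever reaches the indexer. The helper name is_valid_utf8() is hypothetical and not part of swish-e, and the sketch assumes the page is supposed to be UTF-8; a page that declares some other encoding would need to be checked against that encoding instead.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of swish-e): returns 1 if the raw bytes
# decode cleanly as strict UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    my $ok = eval {
        # FB_CROAK makes decode() die on the first malformed sequence
        Encode::decode('UTF-8', $copy, Encode::FB_CROAK());
        1;
    };
    return $ok ? 1 : 0;
}

my $bad  = "\xB5\x74\xA3\xBA";    # the bytes from the error message
my $good = "caf\xC3\xA9";         # "cafe" with an accented e, encoded as UTF-8

print is_valid_utf8($bad)  ? "index\n" : "skip\n";    # prints "skip"
print is_valid_utf8($good) ? "index\n" : "skip\n";    # prints "index"
```

A filter or spider script could call such a check on each page's raw content and skip the page when it returns 0. This only tests well-formedness; it says nothing about what the indexer itself does after a failed conversion.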

And what happens when an input conversion fails
anyway? Is the part before the error indexed, or is
the whole page ignored?

Thanks in advance,
  jrobinson


--- J Robinson <jrobinson852@yahoo.com> wrote:

> Thanks for the useful responses, Bill and friedfish!
> jrobinson
> 
> --- Bill Schell <friedfish@optonline.net> wrote:
> 
> > We prefer to be called swisheans. :-)
> > 
> > This function should do what you want, assuming you want
> > to know if you have any multibyte characters in a
> > string.
> > 
> > use bytes();
> > 
> > sub string_has_multibyte_chars {
> >     my ($string) = @_;
> >     return 1 if (length($string) < bytes::length($string));
> >     return 0;
> > }
> > 
> > Note that you must have the 'use' line exactly that way.
> > If you just say 'use bytes', then all of your calls to length
> > will be to bytes::length in the current lexical scope.
> > 
> > 
> > J Robinson wrote:
> > 
> > >Hello swisheites:
> > >
> > >I have a question related to swish-e, if someone in
> > >this knowledgeable group might know the answer:
> > >
> > >Suppose I have a word in a perl scalar ($w).
> > >
> > >How can I detect if $w contains multibyte or 'wide'
> > >characters?
> > >
> > >Thanks in advance if anyone knows. I suppose this
> > >might be in a faq somewhere but I couldn't find it.
> > >Thanks again.
> > >jrobinson
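
A possible reason the quoted string_has_multibyte_chars() never fires, offered as a sketch and not a diagnosis: length() and bytes::length() only differ when Perl holds the string as decoded characters. For raw bytes read from a file with no decoding layer, both return the byte count. The example below assumes the input is UTF-8 and uses the core Encode module to decode it; the sample string is made up for illustration.

```perl
use strict;
use warnings;
use bytes ();     # loads bytes::length without changing length() semantics
use Encode ();

sub string_has_multibyte_chars {
    my ($string) = @_;
    return length($string) < bytes::length($string) ? 1 : 0;
}

my $raw     = "caf\xC3\xA9";                    # UTF-8 bytes, as read from a file
my $decoded = Encode::decode('UTF-8', $raw);    # the same text as characters

print string_has_multibyte_chars($raw), "\n";        # prints 0
print string_has_multibyte_chars($decoded), "\n";    # prints 1
```

Note that the comparison depends on Perl's internal string representation, so it is best treated as a heuristic and not as a validity check.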

Received on Tue Feb 22 06:13:19 2005