[JDEV] utf8
Glen
jdev at empireenterprises.com
Thu Dec 4 10:48:49 CST 2003
Dude!
Finally figured out this utf8 perl nonsense.
As it turns out, perl can use utf8 by default in its strings; however,
there is a utf8 "flag" on each variable that is not turned on by
default.
I was using the is_utf8 function in the Encode module to test whether
the string was utf8, but this only checks for the flag, which has to be
explicitly set (for example by decoding the bytes). >:(
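
To make the flag behavior concrete, here is a minimal sketch using only the
core Encode module. The byte string is an assumption standing in for what
LWP hands back, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode is_utf8);

# Hypothetical stand-in for raw content off the wire: the UTF-8 bytes for "é".
my $bytes = "\xC3\xA9";

# The bytes are valid UTF-8, but the flag is off because nothing decoded them.
my $flag_before = is_utf8($bytes) ? 1 : 0;

# decode() turns the bytes into a character string and sets the flag.
my $chars = decode("UTF-8", $bytes);
my $flag_after = is_utf8($chars) ? 1 : 0;

print "before decode: flag=$flag_before length=", length($bytes), "\n";
print "after decode:  flag=$flag_after length=", length($chars), "\n";
```

So is_utf8 says something about how perl is holding the string, not about
whether the bytes you received happen to be UTF-8.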
Basically, since I'm getting my content through LWP, I'm checking the
content for the character set. I search for /charset=utf-?8/i; if it
doesn't match, I convert to UTF-8 using the Encode module:
use Encode qw(encode);
# Treats each byte of $string as a Latin-1 character and re-encodes it as UTF-8.
$string = encode("utf8", $string);
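
The check-then-convert step could be sketched roughly as below. This is only
an illustration: to_utf8_bytes is a hypothetical helper name, the header
values are made-up samples, and treating every non-UTF-8 response as
ISO-8859-1 is an assumption that will not hold for every site.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: return UTF-8 bytes for content fetched over HTTP.
# $content_type is the Content-Type header value; $body is the raw bytes.
sub to_utf8_bytes {
    my ($content_type, $body) = @_;

    # Already UTF-8 according to the header: pass the bytes through untouched
    # so they are not encoded a second time.
    return $body
        if defined $content_type && $content_type =~ /charset="?utf-?8/i;

    # Assumption: anything else is ISO-8859-1.
    my $chars = decode("iso-8859-1", $body);
    return encode("UTF-8", $chars);
}

my $latin1 = to_utf8_bytes("text/html; charset=iso-8859-1", "caf\xE9");
my $utf8   = to_utf8_bytes("text/html; charset=UTF-8",      "caf\xC3\xA9");
print $latin1 eq $utf8 ? "same bytes\n" : "different bytes\n";
```

Passing UTF-8 content through unchanged is what keeps the data from being
double-encoded.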
My code was previously crashing as well, whenever it received a funky
character that iso-8859-1 didn't recognize, but this has taken care of
it.
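
On the earlier question of how to tell whether a string is UTF-8 encoded,
one possible approach is to try a strict decode and trap the failure, so bad
input does not crash the program. The helper name looks_like_utf8 is
hypothetical. Note that plain ASCII also passes, and a Latin-1 string can
occasionally happen to be valid UTF-8, so treat this as a heuristic.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if $bytes is well-formed UTF-8, false otherwise.
sub looks_like_utf8 {
    my ($bytes) = @_;
    # Work on a copy, because decode() may modify its input when a CHECK
    # argument is given. FB_CROAK makes decode() die on malformed input,
    # and eval turns that into a false return instead of a crash.
    my $copy = $bytes;
    return eval { decode("UTF-8", $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

print looks_like_utf8("caf\xC3\xA9") ? "utf8\n" : "not utf8\n";
print looks_like_utf8("caf\xE9")     ? "utf8\n" : "not utf8\n";
```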
Hope this helps you...
-g
--search keywords:
utf8
perl
fixed
help
--
On Wed, 2003-12-03 at 22:22, Jeremy Nickurak wrote:
> > On Mon, 2003-12-01 at 20:48, jdev at empireenterprises.com wrote:
> > I found the Encode module, which includes a utf8 checking function, "is_utf8".
> > According to this, my utf8 conversion functions are not working properly, as
> > is_utf8 is always returning false whenever I get content from LWP::UserAgent.
> >
> >
> > I've tried using both Unicode::MapUTF8 & Encode modules, to no avail. I'll keep
> > looking for perl utf8 information.
> >
> > -g
> >
> > Quoting Nicholas Perez <nick at jabberstudio.org>:
> >
> > > Depending on your Perl version, all strings should already be unicode
> > > enabled. You should `man perluniintro` or `man perlunicode` for further
> > > information.
> > >
> > >
> > > Glen wrote:
> > >
> > > >Hmm.
> > > >Any ideas on how I would determine whether a string is UTF-8 encoded or
> > > >not?
> > > >
> > > >-g
> > > >
> > > >
> > > >
> > > >On Mon, 2003-12-01 at 18:19, Justin Karneges wrote:
> > > >
> > > >
> > > > >>Make sure you don't double-encode your data. Your XML library probably
> > > > >>supports unicode already, and so there should be no need to explicitly
> > > > >>encode anything yourself.
> > > >>
> > > >>-Justin
> > > >>
> > > >>On Monday 01 December 2003 02:27 pm, Glen wrote:
> > > >>
> > > >>
> > > >>>general public,
> > > >>>
> > > >>>I'm attempting to send multiple languages in a jabber message.
> > > >>>I'm using Net::Jabber to send, & I'm encoding content into UTF-8 with
> > > >>>Unicode::MapUTF8; however, I'm receiving gibberish in the client.
> > > >>>
> > > >>>I don't know much about Unicode, but from what I understand, there isn't
> > > >>>much to it. My client (PSI on linux) supposedly supports UTF-8 - is
> > > >>>there something that I'm missing, or is there a direction anyone can
> > > >>>point me in?
> > > >>>
> > > >>>-g
> > > >>>
> > > >>>
> > > >>>
> > > >>>_______________________________________________
> > > >>>jdev mailing list
> > > >>>jdev at jabber.org
> > > >>>http://mailman.jabber.org/listinfo/jdev
> > > >>>
> > > >>>
> > > >>_______________________________________________
> > > >>jdev mailing list
> > > >>jdev at jabber.org
> > > >>http://mailman.jabber.org/listinfo/jdev
> > > >>
> > > >>
> > > >
> > > >_______________________________________________
> > > >jdev mailing list
> > > >jdev at jabber.org
> > > >http://mailman.jabber.org/listinfo/jdev
> > > >
> > > >
> > > >
> > >
> > > _______________________________________________
> > > jdev mailing list
> > > jdev at jabber.org
> > > http://mailman.jabber.org/listinfo/jdev
> > >
> >
> >
> >
> >
>
> I had no end of problems with UTF8 in perl writing janchor. I never did
> find any solutions, unfortunately. If you ever do find a solution, I'd
> be very interested in hearing it, as it's still a constant problem that
> causes crashes frequently.