[jdev] Trouble sending greek characters (and I guess other languages) in Jabber messages (with Perl's Net::Jabber)

Peter Saint-Andre stpeter at jabber.org
Fri Sep 30 10:04:43 CDT 2005


I don't know about your Perl script or Net::Jabber, but the 
Jabber/XMPP transport layer has always been pure UTF-8. If you send 
valid UTF-8, all the servers will transport it correctly. Whether 
the receiving client shows the characters correctly depends on how 
complete that client's Unicode support is.
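
As a rough illustration of the symptom described in the quoted message below, here is a small sketch in Python (not the Perl script in question, and not Net::Jabber). It assumes, without confirmation from this thread, that somewhere along the way each byte is being treated as one ISO-8859-1 character. The Greek spelling of "Ya sou" and the helper name are made up for the example.

```python
# Illustrative only: the Greek spelling of "Ya sou" is assumed here.
greeting = "Γεια σου"

# The same text as raw bytes in the two encodings mentioned in the question.
utf8_bytes = greeting.encode("utf-8")        # 2 bytes per Greek letter
greek_bytes = greeting.encode("iso-8859-7")  # 1 byte per Greek letter


def seen_if_treated_as_latin1(raw: bytes) -> str:
    """Return the text a reader sees when each byte is taken as one
    ISO-8859-1 character (the assumed mistake, not a confirmed diagnosis)."""
    return raw.decode("iso-8859-1")


seen_from_greek = seen_if_treated_as_latin1(greek_bytes)
seen_from_utf8 = seen_if_treated_as_latin1(utf8_bytes)

# Same character count as the original for ISO-8859-7 input,
# roughly double for UTF-8 input (the ASCII space stays one byte).
print(len(greeting), len(seen_from_greek), len(seen_from_utf8))

# Encoding the real characters to UTF-8 exactly once round-trips cleanly.
assert utf8_bytes.decode("utf-8") == greeting
```

Under that assumption, the counts line up with "the same number" of Western letters for ISO-GREEK input and "double" for UTF-8 input. In a Perl script the corresponding tools would be the core Encode module's decode and encode functions; whether Net::Jabber expects character strings or UTF-8 bytes is not established in this thread.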

Peter

John Talbot wrote:
> How is it that Jabber clients can send Greek characters to other Jabber 
> clients and they display nicely? They do, somehow.
> 
> But when I try to get my Perl script to send a Greek character to a 
> Jabber server (and I believe I've tried everything under the 
> sun - details below), all I get on my Jabber client is weird accented 
> characters from the upper half of the ISO-8859-1 charset.
> 
> Here's what I tried, and you'll see the paradox:
> 
> Using a Windows Jabber client's raw XML entry (from an admin account), I 
> wrote:
> 
> SENT: <message to="192.168.1.100/announce/motd"><body>Ya sou (but in 
> greek letters)</body></message>
> 
> This makes the client show an announcement in perfectly readable Greek 
> characters.
> 
> However, when I asked my Perl script to send the exact same command (with 
> "Ya sou" written in ISO-GREEK in one case and in UTF-8 in the other), only 
> accented Western European vowels appeared in the notification. In the 
> ISO-GREEK case, I got the same number of accented Western letters as 
> there are Greek letters in "Ya sou"; in the UTF-8 case, I got double 
> that number.
> 
> This seems rather strange... because if during Pandion's raw XML entry, 
> the letters of "Ya sou" were sent neither in ISO-GREEK nor in UTF-8, 
> then how WERE they sent?
> 
> [Pause]
> 
> I just read a bit of the core XMPP protocol, it says an 
> xml:lang='language-code' attribute should be used right after connection...
> 
> I'm using the Net::Jabber library for my Perl scripts. Could it be that 
> the library doesn't use xml:lang? (I have no way to check.) Even so, I 
> tried including an xml:lang='el' attribute ('el' is the language code 
> for Greek) in the <message> tag that Perl sends, and that didn't change 
> a thing, with either the UTF-8 version of the string or the ISO-GREEK 
> one, even though according to the core XMPP protocol it should have 
> worked if that were the problem.
> 
> What do you think might be going on?
> 
> I could send you a trimmed version of the script if you need.
> 
> Many thanks,
> John
> 


-- 
Peter Saint-Andre
Jabber Software Foundation
http://www.jabber.org/people/stpeter.shtml