[JDEV] [INFO] i18n? (fwd)

Scott Robinson quad at jabber.org
Sun Jan 2 18:17:46 CST 2000


Oh lordie. First off, before I wade into our internationalization debate
AGAIN: anyone curious about non-UTF* support should head to the jdev
archives from September and read the charset/encoding thread.

--

To make the answer short, the issue has been partially resolved.

At the protocol level: Expat, and by extension xmlnode, supports UTF-8 and
UTF-16. However, all our code assumes 8-bit characters, which won't help.
The moment we start screwing around with UTF-16 or Unicode, I can see some
serious parsing problems. ;)
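
To illustrate the 8-bit assumption, here is a minimal Python sketch. It is
not Jabber code; the sample strings and variable names are made up for the
example. It only shows where byte-oriented string handling goes wrong.

```python
# Sketch: why code that assumes one byte per character breaks on
# UTF-8 and UTF-16 input. All names here are illustrative.
text = "Привет"  # six Cyrillic characters

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# Character count and byte count diverge once the text is not ASCII.
char_count = len(text)      # 6
utf8_bytes = len(utf8)      # 12, two bytes per Cyrillic letter
utf16_bytes = len(utf16)    # 12, one 16-bit unit per letter

# UTF-16 puts zero bytes inside plain ASCII text, so a C-style string
# routine that stops at the first NUL sees only the first character.
ascii_utf16 = "jabber".encode("utf-16-le")
c_string_view = ascii_utf16.split(b"\x00", 1)[0]   # b"j"

# Cutting a UTF-8 buffer at an arbitrary byte offset can split a character.
truncated = utf8[:3]
try:
    truncated.decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False
```

The same effects would show up in any code that measures, truncates, or
scans text one byte at a time.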

At the message level: the MIME extensions offer a solution to the encoding
issue, in that placing the proper headers should tell any MIME-intelligent
client that the message CDATA itself is encoded in a different character set
than the protocol stream.
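
As a rough sketch of the idea, the snippet below uses Python's standard
email package to label a body with its charset and to decode it on the
receiving side. How a MIME part would actually be embedded in a Jabber
message is not specified here, so treat the setup (sample text, koi8-r,
base64 transfer encoding) as assumptions made for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Sender side: label the body with the charset it is actually written in.
# base64 is chosen here so the serialized form stays 7-bit clean.
outgoing = EmailMessage()
outgoing.set_content("Привет, мир", charset="koi8-r", cte="base64")
wire_bytes = outgoing.as_bytes()

# Receiver side: a MIME-aware client reads the Content-Type header and
# decodes the payload with the declared charset, independent of whatever
# encoding the surrounding stream uses.
incoming = message_from_bytes(wire_bytes, policy=policy.default)
declared = incoming.get_content_charset()
body = incoming.get_content()
```

After this runs, declared holds the charset taken from the header and body
holds the decoded text (with the trailing newline that set_content adds).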

Scott.

* Eliot Landrum translated into ASCII [Sun, Jan 02, 2000 at 05:52:58PM -0600][<Pine.LNX.3.96.1000102175155.15809F-100000 at lito.aspect.net>]
> Might someone have a more technical / authoritative answer than what I can
> give?
> 
> ---------- Forwarded message ----------
> Date: Thu, 30 Dec 1999 15:32:54 -0500
> From: Constantin Riabitsev <tech at nicodemusproject.com>
> To: info at jabber.org
> Subject: [INFO] i18n?
> 
> Hi guys!
> 
> Just found out about Jabber, spent all evening looking through the
> docs and DTD's and realized that there's no trace of any
> internationalization stuff. People communicate in more than one
> encoding, and I think it would be wise to incorporate the standard
> i18n features into the DTD's. You know, attributes like
> charset="koi8-r" or dir="ltr"...
> 
> I think they would be appropriate in jabber:iq:info section since
> most people don't change their encoding preferences very often, but
> just in case I decided to type up a message in an encoding other
> than my default one, there should be "charset" and "dir" attributes
> defined in the !ATTLIST for <message>.
> 
> Example for the jabber:iq:info query-response would be then:
> 
> <iq to="user at server.com" type="get">
>   <query xmlns="jabber:iq:info"><name/><email/><i18n/></query>
> </iq>
> 
> <iq from="user at server.com" type="result">
>   <query xmlns="jabber:iq:info">
> 	<name>John Doe</name>
> 	<email>john at doe.com</email>
> 	<i18n charset="us-ascii" dir="ltr"/>
>   </query>
> </iq>
> 
> This will tell the client that John Doe uses us-ascii and sends
> messages in left-to-right (I think it is safe to provide us-ascii
> and ltr as default settings in the DTD:
> 
> charset		CDATA	"us-ascii"
> dir		CDATA	"ltr"
> 
> ).
> 
> The reason why this is important is because there are sometimes
> several typeset standards for some language. E.g. Russian Cyrillic
> has two widespread standards -- win1251 (windows platforms) and
> koi8-r (*nix platforms) and it is sometimes impossible to use IM
> clients between these two unless the client can re-code from one
> into another.
> 
> Using the i18n parameters, the client will know which encoding the
> messages come in and it will be able to recode them (if this
> capability is built into it).
> 
> Example of an <iq> query reply:
> 
> <iq from="user at server.ru" type="result">
>   <query xmlns="jabber:iq:info">
> 	<name>Ivan Petrov</name>
> 	<email>petrov at server.ru</email>
> 	<i18n charset="win-1251" dir="ltr"/>
>   </query>
> </iq>
> 
> This will tell my Linux client that before I can understand what
> Ivan Petrov writes me, it will need to apply the win1251->koi8-r
> recoding routines.
> 
> Hope this is useful.. :)
> Let me know what you think about this idea.
> 
> Sincerely,
> -- 
> Konstantin Riabitsev,  
> Nicodemus Project Tech.
> Homines quod volunt credunt.
> 
> 
> 
> _______________________________________________
> jdev mailing list
> jdev at jabber.org
> http://mailman.jabber.org/listinfo/jdev
> 
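
The recoding step described in the quoted proposal could look roughly like
the Python sketch below. The <i18n/> element is only a proposal, and the
alias table, function names, and sample data are hypothetical; nothing here
is an existing Jabber API.

```python
import xml.etree.ElementTree as ET

# Hypothetical reply carrying the proposed <i18n/> element.
REPLY = """
<iq from="user at server.ru" type="result">
  <query xmlns="jabber:iq:info">
    <name>Ivan Petrov</name>
    <i18n charset="win-1251" dir="ltr"/>
  </query>
</iq>
"""

# Assumed mapping from the labels used in the proposal to Python codec names.
CODEC_ALIASES = {"win-1251": "cp1251", "win1251": "cp1251", "koi8-r": "koi8_r"}


def declared_codec(reply_xml, default="us-ascii"):
    """Return the Python codec named by the reply's <i18n charset=...>."""
    root = ET.fromstring(reply_xml)
    i18n = root.find("{jabber:iq:info}query/{jabber:iq:info}i18n")
    label = default if i18n is None else i18n.get("charset", default)
    return CODEC_ALIASES.get(label.lower(), label)


def recode(raw, source_codec, target_codec):
    """Decode bytes from the sender's charset, re-encode for the local one."""
    return raw.decode(source_codec).encode(target_codec)


# Bytes as a win1251 sender would produce them.
incoming = "Привет".encode("cp1251")
local = recode(incoming, declared_codec(REPLY), "koi8_r")
```

Going through a decoded string in the middle avoids a hand-written
win1251-to-koi8-r table, at the cost of failing loudly on bytes that are not
valid in the declared charset.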

-- 
jabber:quad at jabber.org         - Universal ID (www.jabber.org)
http://dsn.itgo.com/           - Personal webpage
robhome.dhis.org               - Home firewall

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GAT dpu s+: a--- C++ UL++++ P+ L+++ E- W+ N+ o+ K++ w++
O M V PS+ PE Y+ PGP++ t++ 5++ X+ R tv b++++ DI++++ D++
G+ e+ h! r-- y-
------END GEEK CODE BLOCK------
