[jdev] Using german umlauts and other special characters in jabber client

Matthias Wimmer m at tthias.net
Fri Oct 15 04:22:23 CDT 2004


Hi Jana!

Jana von dem Berge schrieb am 2004-10-15 10:58:44:
> I'm running jabberd 1.4.3 and I have problems with German umlauts.
> When I send a message with a Client to the server with e.g. an 'ü' I see the 'ü' in the debugging outputs of the server.
> But when I read from the socket with 
> 
> recv(j->fd, buf, sizeof(buf)-1, 0); (language c)
> 
> the umlaut is now an 'ü' and other special characters like 'à' act the same way.
> 
> Do you think I have to configure my jabber server, or do I replace all these characters in my client C program with the
> right characters?

These are not strange characters ;) It's just that XMPP/Jabber does not
use the Latin-1/ISO-8859-1 charset you are used to (and are probably
using in your applications) but the UTF-8 encoding of Unicode. XMPP can
therefore carry "all" characters, not only the limited set of (fewer
than) 256 characters that Latin-1 offers. In UTF-8, 'ü' is sent as the
two bytes 0xC3 0xBC, which a Latin-1 program displays as two separate
characters.

If you want to stick with Latin-1 as your local charset (*), make sure
that you convert the characters from your local charset to UTF-8 before
transmitting, and convert them from UTF-8 back to your local charset
after receiving.
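
Because Latin-1 maps one-to-one onto the first 256 Unicode code points,
this conversion can be written by hand. The following is only a minimal
sketch, not code from jabberd or any Jabber library: the function names
latin1_to_utf8() and utf8_to_latin1() are made up for illustration, and
characters that do not exist in Latin-1 are simply replaced by '?'.

```c
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated Latin-1 string to UTF-8.
 * Returns the number of bytes written (without the NUL), or (size_t)-1
 * if the output buffer is too small. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t n = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)in; *p; p++) {
        if (*p < 0x80) {
            /* ASCII is identical in both encodings */
            if (n + 1 >= out_size) return (size_t)-1;
            out[n++] = (char)*p;
        } else {
            /* 0x80..0xFF becomes a two-byte UTF-8 sequence */
            if (n + 2 >= out_size) return (size_t)-1;
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    if (n >= out_size) return (size_t)-1;
    out[n] = '\0';
    return n;
}

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to Latin-1.
 * Anything outside Latin-1 (or malformed) is written as '?'. */
size_t utf8_to_latin1(const char *in, char *out, size_t out_size)
{
    size_t n = 0;
    const unsigned char *p = (const unsigned char *)in;

    while (*p) {
        unsigned char c;
        if (p[0] < 0x80) {
            c = p[0];
            p += 1;
        } else if ((p[0] == 0xC2 || p[0] == 0xC3) && (p[1] & 0xC0) == 0x80) {
            /* two-byte sequence for U+0080..U+00FF */
            c = (unsigned char)(((p[0] & 0x03) << 6) | (p[1] & 0x3F));
            p += 2;
        } else {
            /* not representable: skip the whole sequence, emit one '?' */
            c = '?';
            p += 1;
            while ((*p & 0xC0) == 0x80) p++;
        }
        if (n + 1 >= out_size) return (size_t)-1;
        out[n++] = (char)c;
    }
    if (n >= out_size) return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

In a client you would call latin1_to_utf8() on the text before writing
it to the socket, and utf8_to_latin1() on the character data you get
from recv(), once it has been NUL-terminated.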

If you declare an encoding other than UTF-8 in the XML declaration using
the encoding attribute, the jabberd 1.4.x implementation and many others
will detect that you are using another charset and will convert the
incoming data to UTF-8 ... but, as you noticed, on the outgoing stream
jabberd always uses the UTF-8 encoding.
That jabberd accepts encodings other than UTF-8 is a feature of the XML
parser used by jabberd (and other servers), but you should not rely on
this, as the XMPP standard does not require servers to accept encodings
other than UTF-8.

If you are working on a Unix platform, you can use the function iconv()
(man 3 iconv) to convert between different charsets. I don't know whether
this function exists on (native) Windows as well.
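
A rough sketch of how iconv() could be wrapped follows. The wrapper
name convert_charset() and its parameters are my own invention for this
example; only iconv_open(), iconv() and iconv_close() are the real
library calls. The charset names "ISO-8859-1" and "UTF-8" are assumed
to be known to your iconv implementation (they are in glibc).

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper: convert in_len bytes of `in` from charset `from`
 * to charset `to`. The result is NUL-terminated. Returns the number of
 * bytes written (without the NUL), or (size_t)-1 on any error, e.g.
 * unknown charset, invalid input, or output buffer too small. */
size_t convert_charset(const char *from, const char *to,
                       const char *in, size_t in_len,
                       char *out, size_t out_size)
{
    iconv_t cd;
    char *src = (char *)in;   /* iconv() wants char **, input is not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left;
    size_t rc;

    if (out_size == 0) return (size_t)-1;
    dst_left = out_size - 1;  /* keep one byte for the NUL */

    cd = iconv_open(to, from);  /* note the order: target first */
    if (cd == (iconv_t)-1) return (size_t)-1;

    rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1) {
        /* flush any pending shift state (a no-op for UTF-8 and Latin-1) */
        rc = iconv(cd, NULL, NULL, &dst, &dst_left);
    }
    iconv_close(cd);
    if (rc == (size_t)-1) return (size_t)-1;

    *dst = '\0';
    return (size_t)(dst - out);
}
```

For outgoing text you would call it with from = "ISO-8859-1" and
to = "UTF-8", and the other way round for data you have received. Note
that the conversion to Latin-1 fails if the text contains a character
that Latin-1 does not have, so you need to handle the error case.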


Tot kijk
    Matthias


(*) You might also consider switching to Unicode in your application by
using the wchar_t character type instead of char.

-- 
Fon: +49-(0)70 0770 07770       http://web.amessage.info
HAM: DB1MW                      xmpp:mawis at amessage.info