[JDEV] charsets (was: Protocol extension?)

Anders Qvist quest at netg.se
Thu Jul 29 02:29:12 CDT 1999


On Wed, 28 Jul 1999, Jeremie wrote:

> Regarding the whole charset discussion... this is an issue that I haven't
> dealt with yet simply because I've had no experience with other character
> sets.
> 
> I do know that XML is very closely related to Unicode, but I'm not too
> familiar with Unicode.  Would it be suggested to simply standardize on one
> charset, be that Unicode or one of the ISO* charsets?  I'm open to
> suggestions, but want to keep the client restrictions to a minimum (such
> that they aren't forced to understand multiple character sets).

Problem is, charsets change every once in a while. For example,
ISO-8859-1 (Latin-1) does not contain the euro symbol used for the
European Monetary Union currency. For this purpose there is a new
character set (can't remember the ISO code). This is probably going to
happen every few years.
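
To make the euro problem concrete, here is a small Python sketch. It
assumes that the newer character set meant above is ISO-8859-15; that
name is my assumption and is not stated in this thread. The helper
can_encode is a made-up name used only for illustration.

```python
# Sketch: check whether a piece of text can be represented in a charset.
# Assumption: "iso-8859-15" stands in for the newer charset with the euro sign.
EURO = "\u20ac"  # the euro sign as a Unicode code point


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


latin1_has_euro = can_encode(EURO, "iso-8859-1")      # euro is missing here
latin9_has_euro = can_encode(EURO, "iso-8859-15")     # euro is present here
latin1_has_swedish = can_encode("åäö", "iso-8859-1")  # Swedish letters fit
```

The point is only that "which characters exist" depends on the charset,
so a format that silently assumes one charset breaks when a newer one
appears.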

As to the argument about keeping client restrictions to a minimum: as
a Swede, I'd say a program such as a Jabber client is not working if I
can't use national Swedish characters (mainly åäö) when communicating
with other Swedes.

Anyway, you don't need to force the clients to understand charsets, but
the Jabber message format must be able to convey which character set
the message is written in. Personally, I'd prefer if ALL messages were
marked with a character set, even if it is "the default one".
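
A rough sketch of what I mean, in Python. This is not the Jabber
message format; TaggedMessage, pack, unpack and DEFAULT_CHARSET are
hypothetical names, and picking ISO-8859-1 as the default is purely an
assumption for the example.

```python
from typing import NamedTuple

# Assumption: a stand-in for "the default one"; the real default is undecided.
DEFAULT_CHARSET = "iso-8859-1"


class TaggedMessage(NamedTuple):
    """A message body that always carries the name of its charset."""
    charset: str    # always present, even when it is the default
    payload: bytes  # the text as encoded on the wire


def pack(text: str, charset: str = DEFAULT_CHARSET) -> TaggedMessage:
    """Encode text and label it with the charset that was used."""
    return TaggedMessage(charset, text.encode(charset))


def unpack(msg: TaggedMessage) -> str:
    """Decode the payload using the charset the sender declared."""
    return msg.payload.decode(msg.charset)


sent = pack("Hej, åäö!")
received = unpack(sent)

# Without the label the receiver has to guess, and a wrong guess
# silently produces different text instead of an error.
misread = sent.payload.decode("cp1251")
```

A client that knows only one charset can still look at the label and
at least tell the user that it cannot display the message, instead of
showing garbage.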

Sorry to jump on you like this. Most European countries have some
national characters not included in US-ASCII, and it is frustrating
having those issues continually marginalised by US developers. I can
only imagine how the Japanese feel about it.

If there is anything I can do to help in the matter, please tell me.
 
> Jer
> 
> On Wed, 28 Jul 1999 Lindsay.Marshall at newcastle.ac.uk wrote:
> 
> > 
> > >   Hrm, I see your point..  What about designating that the only valid char set for use within Jabber
> > > be either ISO or Unicode?  I really haven't delved into using multiple char sets within ANY programs,
> > > so you're most likely ahead of me as far as knowledge of the actual requirements is concerned..
> > 
> > Well, in Tcl it is completely trivial to work with multiple character
> > set encodings as all the support is built in already - if you know
> > the encoding you translate it to Unicode straight away and back - but I
> > don't know about any other systems.
> > 
> > L.
> > -- 
> > http://catless.ncl.ac.uk/Lindsay
> > 
> 
> 

Anders "Quest" Qvist
NetGuide Scandinavia

-- Why suffer scarcity? Look for the Open Source and enter a world of plenty!



