[JDEV] look for help about unicode in jabber system

Dave dave at dave.tj
Fri Aug 16 16:50:48 CDT 2002


C doesn't require NUL-terminated strings.  It's just that the standard
C string library assumes that strings end in a NUL byte (since that method's
proven to be very effective for many applications).  There are plenty
of counted-string (length-prefixed) libraries for C, and because strings
aren't built into the language, those libraries can be every bit as
efficient as the standard C routines (but then again, PASCAL people don't
really care much about efficiency, anyway ... if they did, they wouldn't be
PASCAL programmers, now, would they?).  If anything, one of C's sons
(that bastard created by Mr. Stroustrup) makes it ridiculously easy
to use Unicode in the full 32-bit UCS-4 format (or any of the other formats,
for that matter), by creating a new character data type, and using the
should've-been-in-STL basic_string template with that new UCS32Char type.
If you'd prefer to avoid leaving C (a very wise choice, IMHO), you can
use a wchar_t array ... or you can just stick with the extraordinarily
simple (and very compatible) UTF-8 :-)
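
To give a feel for how simple UTF-8 is, here is a minimal sketch of an
encoder for a single code point in plain C.  The function name utf8_encode
and its calling convention are made up for illustration; they do not come
from any Jabber library or from the messages in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * out must have room for at least 4 bytes.
 * Returns the number of bytes written, or 0 if cp is not encodable
 * (a UTF-16 surrogate or a value above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {                       /* 7 bits: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 16 bits: 1110xxxx + 2 trailers */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 21 bits: 11110xxx + 3 trailers */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Every byte of a multi-byte sequence has its high bit set, so no byte of an
encoded non-ASCII character can be mistaken for NUL, CR, LF or any other
ASCII control character.  That is why NUL-terminated C strings and
line-oriented text handlers keep working on UTF-8 data.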

As for alignment of structure elements, anything like that is guaranteed
to cause portability headaches.  If you really want to do it in C, you can
either fake it using character arrays, or use an inline assembly block.
Be aware that neither C nor PASCAL provides sufficient portability
when you try to do that kind of stuff, because that requirement by
definition violates any hopes of portability (which is not necessarily
bad, but it's worth considering nonetheless).  Also, the primary reason
for system-dependent alignment is efficiency.  If your 64-bit CPU has
to fetch two separate 64-bit words just to get a 2-bit value, you're
losing lots of potential speed.
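
As a rough illustration of "faking it using character arrays", the sketch
below packs a bit-level header into two bytes with shifts and masks instead
of relying on compiler-specific structure layout.  The header itself (4-bit
version, 2-bit flags, 10-bit length) and all the names are invented for
this example; nothing here is a Jabber wire format.

```c
/* Hypothetical 16-bit header, laid out explicitly so the result does not
 * depend on the compiler's alignment or bit-field rules:
 *
 *   byte 0: vvvv ffLL   v = version (4 bits), f = flags (2 bits),
 *                       L = top 2 bits of length
 *   byte 1: llll llll   l = low 8 bits of length (10 bits in total)
 */
struct header {
    unsigned version;   /* 0..15   */
    unsigned flags;     /* 0..3    */
    unsigned length;    /* 0..1023 */
};

/* Pack the fields into a 2-byte buffer; out-of-range bits are masked off. */
void header_pack(const struct header *h, unsigned char out[2])
{
    out[0] = (unsigned char)(((h->version & 0x0Fu) << 4) |
                             ((h->flags   & 0x03u) << 2) |
                             ((h->length >> 8) & 0x03u));
    out[1] = (unsigned char)(h->length & 0xFFu);
}

/* Unpack a 2-byte buffer back into the fields. */
void header_unpack(const unsigned char in[2], struct header *h)
{
    h->version = (in[0] >> 4) & 0x0Fu;
    h->flags   = (in[0] >> 2) & 0x03u;
    h->length  = ((unsigned)(in[0] & 0x03u) << 8) | in[1];
}
```

The length field crosses a byte boundary, yet the byte array comes out the
same on any compiler or CPU, at the cost of a few shifts on each access.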

 - Dave


Timothy Carpenter wrote:
> 
> I do not think CHAR to UNICODE is the answer. CHAR is 8-bit, but UTF-8 is a
> way of sending Unicode without breaking 'text' streams with data that looks
> like CR, LF, EOF, EOLN, etc. SCSU is another mechanism that makes very
> intelligent use of packing, processing and compromising between ASCII and
> full 16-bit character sets, but I cannot recall if this protects text stream
> handlers from shocks. UTF-8 is less compact, but simpler, with no sliding
> windows.
> 
> To convert is not a huge task, to my memory - just a little masking and bit
> shuffling...shame no one uses PASCAL, as apart from not using <NULL> end
> tags for strings (yeah!), you can define structures to have conditional
> contents nailed down to the bit position, and even crossing
> byte/word/longword boundaries. Thus the data slots in without too much math
> nonsense all over the place.
> 
> Maybe this is why many C programmers quail at the thought of binary
> bit-packed headers and say they are unmaintainable. They probably are...in
> C. ;-)
> 
> Tim
> 
> On 17/08/2002 12:38 pm, "ÕÅ Æé" <jabberjaist at hotmail.com> wrote:
> 
> > Does the Jabber system support East Asian glyphs (Chinese, Japanese
> > and Korean)? I want my Jabber server to support East Asian Unicode
> > text, but I have run into trouble. A friend told me the following.
> > Is it right, or is there a better way to solve the problem?
> > 
> > 
> > - Jabber uses UTF-8 encoding.
> > - We have not been facing any problems because we have been operating in
> >   the ASCII domain, which is a subset of UTF-8.
> > - We need to find some kind of encoding algorithm/API which converts
> >   Unicode to UTF-8 before we send out strings to the server, and some kind
> >   of decoding algorithm/API which does the opposite when we receive strings.
> > - We need some kind of rendering mechanism to make the mapping from
> >   Unicode to the actual character.
> > 
> > - There are a couple of Microsoft APIs called MultiByteToWideChar and
> >   WideCharToMultiByte.
> > - There is an MLang API from Microsoft which has functions like
> >   ConvertStringToUnicode and ConvertStringFromUnicode (I think this is our
> >   best bet. If we read this thoroughly we might be able to solve the
> >   problem).
> > 
> > 
> > 
> > _______________________________________________
> > jdev mailing list
> > jdev at jabber.org
> > http://mailman.jabber.org/listinfo/jdev
> 



