[JDEV] Writings from the Journal of TCharron

Jon A. Cruz joncruz at geocities.com
Thu Aug 5 12:28:53 CDT 1999


"Jon A. Cruz" wrote:

> One example is if a document contains an encoding that is not recognized by the
> parser. Since the encoding declarations are just plain-text labels, the parser
> might not recognize some encodings even if they are supported. In any case, if the
> parser hits an unrecognized encoding, it can't handle the rest of the document,
> and would need to throw an exception. This can be worked around by some form of
> content negotiation, but that has problems also.

I forgot an example I was going to include.

A few years ago my company ran into a problem with web browsers. We had a catalog
product with a web server front end, and lots of Asian clients. At one point we were
getting quite frustrated as we could not get the browsers to consistently shift into
Korean when we needed to. It turns out that Microsoft/Spyglass goofed up with 3.0,
and their browser was looking for a different encoding name to recognize Korean. The
problem is that the MS/SG guys thought they had picked the right name (it was
registered and all), but they hadn't. The name they recognized was one that really
represented content that was Korean only, not mixed Korean/ASCII characters (even
though the Korean code points were the same).
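
A minimal Java sketch of that label problem, assuming a modern JVM with
java.nio.charset (which postdates this message). It resolves a declared label,
lets the runtime's alias table accept other registered spellings of the same
encoding, and fails loudly on anything it does not recognize. The class and
method names are made up for illustration and are not part of any parser API.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, for illustration only.
class EncodingLabel {
    // Maps a declared label (canonical name or alias) to a Charset.
    // Throws IllegalArgumentException if this runtime does not recognize it.
    static Charset resolve(String label) {
        try {
            return Charset.forName(label.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            throw new IllegalArgumentException("unrecognized encoding label: " + label, e);
        }
    }

    public static void main(String[] args) {
        // "latin1" is an alias; the runtime maps it to its canonical charset.
        Charset cs = resolve("latin1");
        System.out.println(cs.name() + " aliases: " + cs.aliases());
    }
}
```

The catch is the same one as in the browser story: two programs only agree when
both sides' alias tables contain the label that was actually sent.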

Another hint comes from the encodings that Microsoft's Java VM supports, or rather
does not support. It's based on Sun's VM, but Sun's VM supports a vastly larger
collection of encodings. Given how much Microsoft limited the encodings supported
there, and given their prior history, one has to be very careful.
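
One way to be careful is to ask the VM up front which encodings it has, rather
than finding out mid-document. A rough sketch, again assuming a modern JVM with
java.nio.charset; the class name and the list of labels are only examples.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical startup check, for illustration only.
class EncodingAudit {
    // Returns the labels from 'required' that this VM cannot handle.
    static List<String> missing(List<String> required) {
        List<String> result = new ArrayList<>();
        for (String label : required) {
            boolean ok;
            try {
                ok = Charset.isSupported(label);
            } catch (IllegalArgumentException e) {
                ok = false; // the label is not even a legal charset name
            }
            if (!ok) {
                result.add(label);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example labels; which ones matter depends on the application.
        List<String> wanted = Arrays.asList("UTF-8", "EUC-KR", "Shift_JIS", "Big5");
        System.out.println("missing: " + missing(wanted));
        System.out.println("available: " + Charset.availableCharsets().keySet());
    }
}
```

What this prints will differ from one VM to another, which is the point.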

Also, on Windows, Microsoft's VM is incapable of displaying anything other than the
default charset, whereas Sun's VM on the exact same machines has no problems
displaying English, Korean, Chinese, Japanese, etc.

Also, although it's pure Unicode internally, Windows CE does not have system-level
support for UTF-8 conversions, even though the rest of Win32 does. Oh, what fun.
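
Where the system offers no conversion routine, the conversion has to be done by
hand. A rough sketch of a UTF-16 to UTF-8 encoder follows. It is written in Java
only to match the VM discussion above; on Windows CE itself this would more
likely be native code. The class name is made up, and unpaired surrogates are
deliberately not validated.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical hand-rolled encoder, for illustration only.
class ManualUtf8 {
    // Encodes a Java (UTF-16) string as UTF-8 bytes.
    // Assumption: input has no unpaired surrogates; they are not checked here.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < 0x80) {                 // 1 byte: ASCII
                out.write(cp);
            } else if (cp < 0x800) {         // 2 bytes
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {       // 3 bytes: rest of the BMP, incl. Hangul
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {                         // 4 bytes: supplementary planes
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```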

(This is not intended as a bunch of Microsoft bashing, but rather to point out some
of the problems that might face programmers trying to implement things on Windows if
different encodings need to be handled.)

--
"My new computer's got the clocks, it rocks
But it was obsolete before I opened the box" - W.A.Y.






