[jdev] BOM
Justin Karneges
justin-keyword-jabber.093179 at affinix.com
Thu Nov 6 16:15:34 CST 2008
On Thursday 06 November 2008 12:49:30 Jonathan Dickinson wrote:
> Much obliged. As a case of interoperability, maybe something like: entities
> MUST NOT send byte order marks, however, they MUST tolerate them.
As I understand it, BOMs are used only for UTF-16. At the XML layer in
Psi/Iris, I believe we support any text encoding, and if UTF-16 is detected
(via the presence of a BOM) then the BOM is honored. The <?xml ...?>
declaration is then read to confirm the encoding. Of course, soon after that
the stream is destroyed because it tried to use an encoding that isn't
UTF-8. :)
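To make that flow concrete, here is a rough sketch in Python. It is not the Psi/Iris code; the helper names utf16_from_bom and require_utf8 are made up for illustration, and it only looks at the first bytes of the stream rather than also parsing the <?xml ...?> declaration.

```python
import codecs

def utf16_from_bom(data: bytes):
    """Return the UTF-16 variant signalled by a leading BOM, or None."""
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None

def require_utf8(data: bytes) -> None:
    """Reject a stream whose BOM says it is UTF-16 (hypothetical policy check)."""
    detected = utf16_from_bom(data)
    if detected is not None:
        raise ValueError(f"stream uses {detected}, but only UTF-8 is allowed")
```

A caller would run require_utf8 on the first chunk read from the socket and close the stream if it raises.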
I'm not sure what happens if a BOM is present in a UTF-8 stream, or what is
supposed to happen. My guess is that it should be ignored, but that it's
also wrong to put one there in the first place. So I think your proposed
text is what we want, but perhaps an encoding guru should double-check.
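For what it's worth, the "MUST NOT send, MUST tolerate" rule could look something like the following in Python. This is only a sketch of my guess above, not tested against any real client or server, and the function names strip_utf8_bom and encode_outgoing are hypothetical.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Tolerate a leading UTF-8 BOM (bytes EF BB BF) by dropping it."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def encode_outgoing(text: str) -> bytes:
    """Encode for sending; Python's "utf-8" codec never prepends a BOM
    (the "utf-8-sig" codec is the one that does)."""
    return text.encode("utf-8")
```

The BOM is stripped only at the very start of the input, so the same bytes appearing later in the stream are left alone.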
-Justin