[jdev] BOM

Justin Karneges justin-keyword-jabber.093179 at affinix.com
Thu Nov 6 16:15:34 CST 2008


On Thursday 06 November 2008 12:49:30 Jonathan Dickinson wrote:
> Much obliged. For the sake of interoperability, maybe something like:
> entities MUST NOT send byte order marks; however, they MUST tolerate them.

As I understand it, BOMs are used only for UTF-16.  At the XML layer in
Psi/Iris, I believe we support any text encoding, and if UTF-16 is detected
(via the presence of a BOM) then the BOM is honored.  The <?xml ...?> line is
then read to confirm the encoding.  Of course, soon after that the stream is
destroyed because it attempted to use an encoding that isn't UTF-8. :)
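
A rough sketch of that sequence, in Python rather than the actual Psi/Iris
code: sniff the first bytes for a UTF-16 BOM, then refuse the stream if the
result is anything other than UTF-8.  The names sniff_encoding and
accept_stream are made up for illustration, and the <?xml ...?> confirmation
step is left out to keep it short.

```python
import codecs


def sniff_encoding(head: bytes) -> str:
    """Guess the encoding from the first bytes of a stream (hypothetical helper).

    Only the two UTF-16 BOMs are checked; anything else is assumed to be UTF-8.
    """
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "utf-8"


def accept_stream(head: bytes) -> str:
    """Return the encoding if usable, else raise (stands in for closing the stream)."""
    encoding = sniff_encoding(head)
    if encoding != "utf-8":
        raise ValueError(f"unsupported stream encoding: {encoding}")
    return encoding
```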

I'm not sure what happens if a BOM is present in a UTF-8 stream, or what is
supposed to happen.  My guess is that it should be ignored, but that it's
also wrong to put one in there in the first place.  So I think your
proposed text is what we want, but perhaps an encoding guru should
double-check.
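
For what it's worth, the "MUST NOT send, MUST tolerate" rule could look
something like the following.  This is only a sketch in Python, not anything
from Psi/Iris, and the function names are invented for the example.  It
assumes the receiver sees the start of the stream as one byte buffer.

```python
import codecs


def strip_leading_bom(data: bytes) -> bytes:
    """Tolerate: drop a UTF-8 BOM (EF BB BF) if it opens the incoming stream."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def encode_outgoing(text: str) -> bytes:
    """Never send: the plain "utf-8" codec does not write a BOM.

    (The "utf-8-sig" codec would prepend one, so it is avoided here.)
    """
    return text.encode("utf-8")
```

Only a BOM at the very start of the stream is stripped here; what to do with
one that shows up later is a separate question.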

-Justin


