[jdev] Re: VTD-XML version 1.6

Jimmy Zhang crackeur at comcast.net
Sat May 20 13:22:02 CDT 2006


Yes, explicit framing is there to ensure data integrity, which is key
to most apps. If an IP packet is not well-formed, a router will
reject it outright. I think XML will become the IP of the messaging world...
----- Original Message ----- 
From: "Dave Cridland" <dave at cridland.net>
To: "Jabber software development list" <jdev at jabber.org>
Sent: Saturday, May 20, 2006 2:19 AM
Subject: Re: [jdev] Re: VTD-XML version 1.6


> On Sat May 20 05:56:19 2006, Justin Karneges wrote:
>> On Friday 19 May 2006 20:39, Peter Saint-Andre wrote:
>> > But it turns out that streaming XML has some inherent benefits, one of
>> > which is that you don't have to create a new parser instance every time
>> > you want to send, receive, or route a message.
>> 
>> More importantly, XMPP-specific parsing code doesn't need to be 
>> written.  Any other wire protocol would require writing a parser, 
>> but with XMPP you can just throw SAX at it.
>> 
>> 
> Ah, you see I approached XMPP looking for the framing for the 
> messages, because every other protocol I deal with has explicit 
> framing for the messages.
> 
> So, I do string matches to pull out the stanzas, and turn them into 
> complete XML documents by wrapping them in the real <stream> and 
> faked </stream>, and use DOM on the resultant docs. In other words, I 
> treat them as framed messages to pull out and parse, where the 
> framing depends on the opening bytes (up to the first space or >). 
> Maybe I'm weird, but it seems to work well. :-)
> 
> There's a potential problem where you end up finding a closing tag 
> that's actually not closing the stanza, because of namespace 
> redefinitions or whatever, but that's relatively easy to deal with, 
> you just find the next candidate end-of-stanza tag. You get similar 
> problems if you want to isolate messages in IMAP, too, where the 
> framing changes depending on the type of message.
> 
> My favourite benefit to XML streams over XML messages, though, is 
> that namespace declarations can be moved out of the messages and into 
> the root element. That's very cool for octet-obsessives like me.
> 
> (For compression people: Although moving the namespace declarations 
> further toward the root of the document tree to remove repetitions is 
> simply a representational change, the longevity of the impact 
> relative to the stream is large, so you tend to run out of the 
> reference length limit for Ziv/Lempel type compressions, and the 
> namespace strings themselves are sufficiently long that statistical 
> modelling compression algorithms won't have a good enough effect. 
> Also, because the namespace declaration strings tend to be 
> self-similar, putting them all together makes them compress better, 
> too.)
> 
> 
>> Granted, I'm also one of those guys that "wouldn't have designed it 
>> that way", but I still think XML streams are cool in that geeky 
>> sort of way.  Look mom, no parser.
>> 
>> 
> I think I probably would have gone for explicit framing, but I put 
> that down to reflex rather than any particularly sound principles. I 
> treat the data as if it does have explicit framing anyway, so it 
> doesn't actually really matter, and different parsing techniques mean 
> that there's advantage in letting the XML do the framing for you in 
> the protocol.
> 
> 
>> I agree with Peter though, talking about the rationale in 2006 is 
>> kind of pointless.
> 
> Well, it's pointless from the point of view of XMPP, certainly, but 
> it's interesting from a more philosophical protocol design kind of 
> way. Which could be pointless, but may not be.
> 
> Dave.
> -- 
> Dave Cridland - mailto:dave at cridland.net - xmpp:dwd at jabber.org
>  - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
>  - http://dave.cridland.net/
> Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
>
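
For readers following along, the approach Dave describes in the quoted
message (string-match a stanza out of the stream, wrap it in the real
<stream> open tag and a faked close tag, parse the result with DOM, and
move on to the next candidate end tag if the first one was not really the
end of the stanza) might look roughly like the sketch below. This is not
Dave's code. The function name extract_stanza, the STREAM_OPEN header, and
the use of Python's xml.dom.minidom as the DOM are all assumptions made for
illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical stream header; in practice this is whatever the peer sent.
STREAM_OPEN = ("<stream:stream xmlns='jabber:client' "
               "xmlns:stream='http://etherx.jabber.org/streams'>")
STREAM_CLOSE = "</stream:stream>"  # the "faked" closing tag


def extract_stanza(buffer):
    """Return (stanza_element, rest_of_buffer), or (None, buffer) if no
    complete stanza is available yet."""
    buffer = buffer.lstrip()
    if not buffer.startswith("<"):
        return None, buffer

    # The framing depends on the opening bytes: the element name runs up
    # to the first whitespace, '/' or '>'.
    end = 1
    while end < len(buffer) and buffer[end] not in " \t\r\n/>":
        end += 1
    first_gt = buffer.find(">")
    if end == len(buffer) or first_gt == -1:
        return None, buffer  # opening tag has not fully arrived
    close_tag = "</" + buffer[1:end] + ">"

    # Collect candidate end-of-stanza offsets, earliest first.
    # Simplification: a '>' inside an attribute value of a self-closing
    # stanza is not handled here.
    candidates = []
    if buffer[first_gt - 1] == "/":  # self-closing, e.g. <presence/>
        candidates.append(first_gt + 1)
    pos = buffer.find(close_tag)
    while pos != -1:
        candidates.append(pos + len(close_tag))
        pos = buffer.find(close_tag, pos + 1)

    for cut in candidates:
        try:
            doc = minidom.parseString(STREAM_OPEN + buffer[:cut] + STREAM_CLOSE)
        except ExpatError:
            continue  # not the real end of the stanza; try the next one
        return doc.documentElement.firstChild, buffer[cut:]
    return None, buffer


if __name__ == "__main__":
    data = "<message to='a@b'><body>hi</body></message><presence/>"
    stanza, data = extract_stanza(data)
    print(stanza.tagName, stanza.getAttribute("to"))
```

Wrapping each stanza in the stream's own opening tag is what lets
namespace declarations on the root element stay in scope when the stanza
is parsed as a standalone document.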



