[jdev] Re: VTD-XML version 1.6
Jimmy Zhang
crackeur at comcast.net
Sat May 20 13:22:02 CDT 2006
Yes, explicit framing is there to ensure data integrity, which is key
to most apps. If an IP packet is not well-formed, a router will
reject it outright. I think XML will become the IP of the messaging world.
----- Original Message -----
From: "Dave Cridland" <dave at cridland.net>
To: "Jabber software development list" <jdev at jabber.org>
Sent: Saturday, May 20, 2006 2:19 AM
Subject: Re: [jdev] Re: VTD-XML version 1.6
> On Sat May 20 05:56:19 2006, Justin Karneges wrote:
>> On Friday 19 May 2006 20:39, Peter Saint-Andre wrote:
>> > But it turns out that streaming XML has some inherent benefits, one of
>> > which is that you don't have to create a new parser instance every time
>> > you want to send, receive, or route a message.
>>
>> More importantly, XMPP-specific parsing code doesn't need to be
>> written. Any other wire protocol would require writing a parser,
>> but with XMPP you can just throw SAX at it.
>>
>>
> Ah, you see I approached XMPP looking for the framing for the
> messages, because every other protocol I deal with has explicit
> framing for the messages.
>
> So, I do string matches to pull out the stanzas, and turn them into
> complete XML documents by wrapping them in the real <stream> and
> faked </stream>, and use DOM on the resultant docs. In other words, I
> treat them as framed messages to pull out and parse, where the
> framing depends on the opening bytes (up to the first space or >).
> Maybe I'm weird, but it seems to work well. :-)
>
> There's a potential problem where you end up finding a closing tag
> that's actually not closing the stanza, because of namespace
> redefinitions or whatever, but that's relatively easy to deal with,
> you just find the next candidate end-of-stanza tag. You get similar
> problems if you want to isolate messages in IMAP, too, where the
> framing changes depending on the type of message.
>
> My favourite benefit to XML streams over XML messages, though, is
> that namespace declarations can be moved out of the messages and into
> the root element. That's very cool for octet-obsessives like me.
>
> (For compression people: Although moving the namespace declarations
> further toward the root of the document tree to remove repetitions is
> simply a representational change, the longevity of the impact
> relative to the stream is large, so you tend to run out of the
> reference length limit for Ziv/Lempel type compressions, and the
> namespace strings themselves are sufficiently long that statistical
> modelling compression algorithms won't have a good enough effect.
> Also, because the namespace declaration strings tend to be
> self-similar, putting them all together makes them compress better,
> too.)
>
>
>> Granted, I'm also one of those guys that "wouldn't have designed it
>> that way", but I still think XML streams are cool in that geeky
>> sort of way. Look mom, no parser.
>>
>>
> I think I probably would have gone for explicit framing, but I put
> that down to reflex rather than any particularly sound principles. I
> treat the data as if it does have explicit framing anyway, so it
> doesn't really matter, and different parsing techniques mean
> that there's an advantage in letting the XML do the framing for you in
> the protocol.
>
>
>> I agree with Peter though, talking about the rationale in 2006 is
>> kind of pointless.
>
> Well, it's pointless from the point of view of XMPP, certainly, but
> it's interesting in a more philosophical, protocol-design kind of
> way. Which could be pointless, but may not be.
>
> Dave.
> --
> Dave Cridland - mailto:dave at cridland.net - xmpp:dwd at jabber.org
> - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
> - http://dave.cridland.net/
> Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
>
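A minimal sketch of the framing approach Dave describes above: string-match
the opening tag name, look for a candidate closing tag, wrap the slice in a
stream open and a faked stream close, and hand it to DOM. If the parse fails,
move on to the next candidate end-of-stanza tag. Everything here is an
assumption for illustration, not Dave's actual code: the function name
extract_stanza, the hard-coded stream header, and the use of Python's
xml.dom.minidom. It also ignores details such as whitespace inside a closing
tag.

```python
import re
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Hypothetical stream header; in practice this is whatever the peer sent.
STREAM_OPEN = (
    '<stream:stream xmlns="jabber:client" '
    'xmlns:stream="http://etherx.jabber.org/streams">'
)
STREAM_CLOSE = "</stream:stream>"  # the "faked" close


def extract_stanza(buffer):
    """Return (stanza_element, remaining_text), or (None, buffer) if no
    complete stanza is available yet."""
    # The framing key is the opening bytes, up to the first space, '/' or '>'.
    match = re.match(r"\s*<([^\s/>]+)", buffer)
    if match is None:
        return None, buffer
    closing = "</" + match.group(1) + ">"

    # Collect candidate end positions for the stanza.
    candidates = []
    first_gt = buffer.find(">", match.end())
    if first_gt > 0 and buffer[first_gt - 1] == "/":
        candidates.append(first_gt + 1)  # self-closing stanza
    pos = buffer.find(closing, match.end())
    while pos != -1:
        candidates.append(pos + len(closing))
        pos = buffer.find(closing, pos + 1)

    for end in candidates:
        wrapped = STREAM_OPEN + buffer[:end] + STREAM_CLOSE
        try:
            doc = xml.dom.minidom.parseString(wrapped)
        except ExpatError:
            continue  # not the real end of the stanza; try the next candidate
        stanza = next(
            node for node in doc.documentElement.childNodes
            if node.nodeType == node.ELEMENT_NODE
        )
        return stanza, buffer[end:]
    return None, buffer
```

Calling extract_stanza repeatedly on the receive buffer peels off one stanza
at a time. A (None, buffer) result means "wait for more bytes", which is the
same contract an explicitly framed protocol would give you.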