[jdev] Re: VTD-XML version 1.6
Dave Cridland
dave at cridland.net
Sat May 20 04:19:19 CDT 2006
On Sat May 20 05:56:19 2006, Justin Karneges wrote:
> On Friday 19 May 2006 20:39, Peter Saint-Andre wrote:
> > But it turns out that streaming XML has some inherent benefits,
> > one of which is that you don't have to create a new parser
> > instance every time you want to send, receive, or route a message.
>
> More importantly, XMPP-specific parsing code doesn't need to be
> written. Any other wire protocol would require writing a parser,
> but with XMPP you can just throw SAX at it.
>
>
Ah, you see I approached XMPP looking for the framing for the
messages, because every other protocol I deal with has explicit
framing for the messages.
So, I do string matches to pull out the stanzas, and turn them into
complete XML documents by wrapping them in the real <stream> and
faked </stream>, and use DOM on the resultant docs. In other words, I
treat them as framed messages to pull out and parse, where the
framing depends on the opening bytes (up to the first space or >).
Maybe I'm weird, but it seems to work well. :-)
There's a potential problem where you end up finding a closing tag
that's actually not closing the stanza, because of namespace
redefinitions or whatever, but that's relatively easy to deal with,
you just find the next candidate end-of-stanza tag. You get similar
problems if you want to isolate messages in IMAP, too, where the
framing changes depending on the type of message.
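
The framing-by-string-match approach described above can be sketched
in Python roughly as follows. This is only an illustration, not the
code actually used: the function name next_stanza, the hard-coded
stream header in STREAM_OPEN (in practice you would keep the real
header received from the peer), and the choice of xml.dom.minidom are
all assumptions. It takes the element name from the opening bytes,
tries each candidate end-of-stanza tag in turn, and accepts the first
one that gives a well-formed document once wrapped in the stream tags.

```python
import re
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# Assumed stream header; a real client would store the one it received.
STREAM_OPEN = (b"<stream:stream xmlns='jabber:client' "
               b"xmlns:stream='http://etherx.jabber.org/streams'>")
STREAM_CLOSE = b"</stream:stream>"  # the "faked" closing tag


def next_stanza(buf, stream_open=STREAM_OPEN):
    """Return (stanza_element, remaining_bytes), or (None, buf) if the
    buffer does not yet hold a complete stanza."""
    buf = buf.lstrip()
    # The framing key: opening bytes up to the first space, '/' or '>'.
    m = re.match(rb"<([^\s/>]+)", buf)
    if m is None:
        return None, buf
    name = m.group(1)

    candidates = []
    # A self-closing stanza such as <presence/> ends at its first '>'.
    first_gt = buf.find(b">")
    if first_gt > 0 and buf[first_gt - 1:first_gt] == b"/":
        candidates.append(first_gt + 1)
    # Otherwise every occurrence of </name> is a candidate end.
    close_tag = b"</" + name + b">"
    pos = buf.find(close_tag)
    while pos != -1:
        candidates.append(pos + len(close_tag))
        pos = buf.find(close_tag, pos + 1)

    for end in candidates:
        wrapped = stream_open + buf[:end] + STREAM_CLOSE
        try:
            doc = xml.dom.minidom.parseString(wrapped)
        except ExpatError:
            # Wrong closing tag (e.g. a nested element with the same
            # name): move on to the next candidate.
            continue
        return doc.documentElement.firstChild, buf[end:]
    return None, buf
```

Calling next_stanza repeatedly on the receive buffer, and appending
more bytes whenever it returns None, gives the framed-message
behaviour. The sketch ignores closing tags written with internal
whitespace (such as "</message >").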
My favourite benefit to XML streams over XML messages, though, is
that namespace declarations can be moved out of the messages and into
the root element. That's very cool for octet-obsessives like me.
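
As a small illustration of that point, here is a sketch assuming the
usual jabber:client stream header and a made-up message; the names
stanza_tags and octets_saved are hypothetical. The stanzas carry no
xmlns attribute of their own, yet the parser still places them in the
namespace declared once on the root element.

```python
import xml.etree.ElementTree as ET

# Assumed stream header and example stanza, for illustration only.
STREAM_OPEN = ("<stream:stream xmlns='jabber:client' "
               "xmlns:stream='http://etherx.jabber.org/streams'>")
STREAM_CLOSE = "</stream:stream>"
STANZA = "<message to='juliet@example.com'><body>hi</body></message>"

# What each message would have to repeat if it were a standalone document.
DECL = " xmlns='jabber:client'"


def stanza_tags(n):
    """Parse a stream holding n stanzas and return their qualified tags."""
    root = ET.fromstring(STREAM_OPEN + STANZA * n + STREAM_CLOSE)
    return [child.tag for child in root]


def octets_saved(n):
    """Octets not sent because the declaration lives on the root only."""
    return len(DECL.encode("ascii")) * n
```

Here stanza_tags(2) yields the tag "{jabber:client}message" twice, so
the default namespace is inherited from the root, and octets_saved(n)
grows linearly with the number of messages.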
(For compression people: Although moving the namespace declarations
further toward the root of the document tree to remove repetitions is
simply a representational change, the repetitions are spread a long
way apart in the stream, so they tend to fall outside the
back-reference distance limit of Ziv/Lempel type compressions, and the
namespace strings themselves are sufficiently long that statistical
modelling compression algorithms won't have a good enough effect.
Also, because the namespace declaration strings tend to be
self-similar, putting them all together makes them compress better,
too.)
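
The reference-limit effect can be checked with a rough experiment. The
sketch below assumes zlib (DEFLATE, whose default window is 32 KiB) as
the Ziv/Lempel compressor; the filler stanzas, the DECLS string, and
the names make_filler and extra_cost are invented for illustration. It
measures how many extra compressed bytes a repeated run of namespace
declarations costs when the earlier copy is about 1 KB back versus
about 64 KiB back.

```python
import zlib

# Illustrative run of namespace declarations (well-known XMPP
# extension namespaces, chosen only as examples).
DECLS = (b" xmlns:disco='http://jabber.org/protocol/disco#info'"
         b" xmlns:muc='http://jabber.org/protocol/muc#user'"
         b" xmlns:caps='http://jabber.org/protocol/caps'")


def make_filler(n_bytes):
    """Build at least n_bytes of made-up stanzas with no declarations."""
    out = bytearray()
    i = 0
    while len(out) < n_bytes:
        out += (b"<message to='user%d@example.com'>"
                b"<body>hello %d</body></message>" % (i, i * 7919))
        i += 1
    return bytes(out)


def extra_cost(gap):
    """Compressed bytes added by repeating DECLS after `gap` bytes."""
    filler = make_filler(gap)
    base = len(zlib.compress(DECLS + filler, 9))
    repeated = len(zlib.compress(DECLS + filler + DECLS, 9))
    return repeated - base


near = extra_cost(1000)       # earlier copy is well inside the window
far = extra_cost(64 * 1024)   # earlier copy has left the 32 KiB window
```

With the near gap the repeat can be encoded as a back-reference; with
the far gap the compressor has to encode the strings again, so far
should come out noticeably larger than near.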
> Granted, I'm also one of those guys that "wouldn't have designed it
> that way", but I still think XML streams are cool in that geeky
> sort of way. Look mom, no parser.
>
>
I think I probably would have gone for explicit framing, but I put
that down to reflex rather than any particularly sound principles. I
treat the data as if it does have explicit framing anyway, so it
doesn't really matter to me, and because people use different parsing
techniques, there's an advantage in letting the XML do the framing for
you in the protocol.
> I agree with Peter though, talking about the rationale in 2006 is
> kind of pointless.
Well, it's pointless from the point of view of XMPP, certainly, but
it's interesting in a more philosophical, protocol-design kind of
way. Which could be pointless, but may not be.
Dave.
--
Dave Cridland - mailto:dave at cridland.net - xmpp:dwd at jabber.org
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade