[JDEV] Binary XML useful for Jabber?
David Waite
dwaite at jabber.com
Wed May 23 01:27:36 CDT 2001
Jens Alfke wrote:
> David Waite <dwaite at jabber.com> wrote:
>
> > I don't think this last item is there. the LITERAL token (for elements which
> > are not in the DTD) reports an offset in the string table, meaning it needs to
> > be defined beforehand and not inline.
>
> Look at "inline string extension tokens", the format of which is a magic
> number followed by the string itself. The spec's not organized or written
> very clearly, but it certainly looks to me as though these let you insert
> arbitrary keywords on the fly if they weren't defined upfront.
>
Hmm, I looked at it again (a couple of times; it is definitely not the clearest
specification ever written). It looks like the textual content of attributes and CDATA
is handled by the inline strings, but tags have to use LITERAL, which takes an
offset into the string table. Another worry is that the tag token does not appear to
extend to multiple bytes if needed, so more than 64 unique tags would seem to
overflow the dictionary. Neither of these is a problem with binary protocols in
general; they are just difficulties in mapping wbXML to Jabber.
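
A minimal sketch in Python of the tag-token layout discussed above, assuming the
tag byte reserves two flag bits (has-attributes, has-content) and leaves six bits
for the tag identity, which is where the 64-tag figure comes from. It also assumes
a LITERAL tag is followed by a multi-byte integer offset into a NUL-terminated
string table. The function names and the sample string table are hypothetical and
only illustrate this reading of the spec.

```python
# Assumed bit layout of a tag token (one reading of the wbXML spec).
HAS_ATTRIBUTES = 0x80   # bit 7: element carries attributes
HAS_CONTENT = 0x40      # bit 6: element carries content
IDENTITY_MASK = 0x3F    # bits 5-0: tag identity, so at most 64 values
LITERAL = 0x04          # identity meaning "name is in the string table"


def split_tag_token(byte):
    """Split one tag byte into (has_attributes, has_content, identity)."""
    return (bool(byte & HAS_ATTRIBUTES),
            bool(byte & HAS_CONTENT),
            byte & IDENTITY_MASK)


def read_multibyte_int(data, pos):
    """Read a big-endian integer stored 7 bits per byte.

    The high bit of each byte is a continuation flag.
    Returns (value, position after the integer).
    """
    value = 0
    while True:
        b = data[pos]
        pos += 1
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:
            return value, pos


def literal_tag_name(data, pos, string_table):
    """Resolve a LITERAL tag at data[pos] through the string table.

    Returns (name, has_attributes, has_content, next position).
    """
    has_attrs, has_content, identity = split_tag_token(data[pos])
    if identity != LITERAL:
        raise ValueError("not a LITERAL tag token")
    offset, pos = read_multibyte_int(data, pos + 1)
    end = string_table.index(b"\x00", offset)  # names are NUL-terminated
    return string_table[offset:end].decode("utf-8"), has_attrs, has_content, pos


# Hypothetical string table that had to be sent before the tag is used.
table = b"presence\x00iq\x00"
print(literal_tag_name(bytes([0x44, 0x09]), 0, table))
```

The point of the sketch is that the tag name is never inline: the decoder can only
resolve the LITERAL token if the name is already in the table at that offset.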
>
> > I really doubt this would simplify parsing (either in terms of execution speed
> > or in Lines of Code).
>
> Sure it would. Binary XML has basically been lexed in advance, and typical
> parsers spend about 25% of their time lexing. A lot of the code complexity
> is in lexing, as well. It should be really easy to write a BXML parser, and
> you could make it plug-compatible with a regular parser like Expat, which
> would make it easy to drop into existing code. Generating XML is less
> standardized, but hopefully people have modularized their XML generators
> such that generating BXML would require changing very little code (I know I
> have.)
>
> > If you didn't "decompress" the binary format before
> > sending it into Jabber, it would require substantial changes which would
> > pretty much encompass every line of code.
>
> I haven't looked at the server implementation, but if dependence on XML
> syntax is so tightly woven into it, then I have to say it sounds like it's
> badly designed. Please don't tell me the modules of the server communicate
> internally via raw XML... :-O
>
Internally loaded (shared library) components of the server communicate via
structures centered around DOM-like XML nodes (some of the structures also carry
additional state information, or routing information parsed out of the packet).
Getting these to work with a binary structure in addition to DOM would require
heavy manipulation. Since the target user's transport isn't known in advance,
the component which finally communicates with the user would probably need to do
the conversion to wbXML, and internally things would keep using DOM.
External (out-of-process) components do communicate over the wire with an XML
protocol.
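
A rough sketch in Python of the arrangement described above: everything inside the
server keeps working on DOM-like nodes, and only the component that finally talks
to the user picks a wire format. This is not the Jabber server's code. The Node
class, the one-byte tag codes and the toy binary encoding are all hypothetical
stand-ins (the toy format is not wbXML and leaves out attributes to stay short).

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class Node:
    """Hypothetical DOM-like node shared by the internal components."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)


def to_xml(node):
    """Serialize a node tree as ordinary textual XML."""
    inner = escape(node.text) + "".join(to_xml(c) for c in node.children)
    return f"<{node.name}>{inner}</{node.name}>"


# Hypothetical one-byte tag codes and control tokens for the toy binary form.
TAG_CODES = {"message": 0x05, "body": 0x06}
END = 0x01            # closes the current element
INLINE_STRING = 0x03  # followed by NUL-terminated UTF-8 text


def to_binary(node):
    """Serialize the same tree as a toy tokenized byte stream."""
    out = bytearray([TAG_CODES[node.name]])
    if node.text:
        out += bytes([INLINE_STRING]) + node.text.encode("utf-8") + b"\x00"
    for child in node.children:
        out += to_binary(child)
    out.append(END)
    return bytes(out)


def serialize_for(transport, node):
    """Choose the wire format only at the edge, once the transport is known."""
    if transport == "binary":
        return to_binary(node)
    return to_xml(node).encode("utf-8")


msg = Node("message", children=[Node("body", text="hi")])
print(serialize_for("xml", msg))
print(serialize_for("binary", msg))
```

With this split, nothing upstream of serialize_for needs to know which transport
the user is on; only the final component changes.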
-David Waite