[JDEV] Ramblings - feel free to join in :-)
Dennis Noordsij
dennis.noordsij at wiral.com
Thu Dec 14 04:28:08 CST 2000
Hi,
I have had 2 things on my mind for a while and would like to take the
opportunity to hear from other people what they think would work and wouldn't
work, or maybe come up with a better idea or implementation.
The first one concerns bandwidth vs horsepower. I think we can pretty safely
assume that:
- in our own jabber server farm bandwidth is plentiful, and the only thing we
are worried about is the raw power of our servers. Any optimizations would be
ones that get more messages routed in the same time, even if that takes up a
little more bandwidth in between jabber components (think of the main JSM to
transports to xdb databases - all on a small LAN).
- with regard to the "outside", i.e. users connecting via TCP/IP over the
internet, we value bandwidth much more. It is alright if the client has to do
a little more work if it means it takes less bandwidth to get a message
across.
How to do this without affecting jabber server code at all, and with only
minor changes to clients?
Why not bzip2 the xml stream? The client would simply stream through a bzip2
function before sending it out over the socket; this would be quite easily
implemented in clients. On the server side, since any serious setup will use
jpolld multiplexing machines, only jpolld has to know about bzip2; when the
XML reaches the jabber server it is plain text XML again. Likewise, why not
stream through an SSL component (with compression)? Once again, on the client
side it would make no difference, and on the server side the jpollds could be
linked against an SSL library, making use of that hardware SSL acceleration
board I see in every issue of LinuxJournal :-)
Even without the SSL, bzip2ing a stream would help tremendously as XML is
basically text and compresses quite well. Only jpolld would have to be fitted
with a bzip2 component (similar to the xstream) and clients could even use a
local proxy that does it for them. Wouldn't the bandwidth savings be
substantial enough to warrant implementation of this? This way we can still
keep using the original protocol without resorting to small proprietary
binary tags as someone else suggested, thus keeping everything open.
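
To get a rough feel for the savings, here is a minimal Python sketch (not
jabberd or jpolld code). The sample stanza and the function names are made up
for illustration, and it compresses a batch of stanzas in one go; a real
stream would have to compress incrementally, and a single short stanza on its
own may not shrink at all because of bzip2's fixed overhead.

```python
import bz2

# Hypothetical sample stanza; real Jabber traffic will differ.
STANZA = (
    "<message to='dennis@jabber.com/work' from='harry@jabber.com/school'>"
    "<body>Hello, are you there?</body></message>"
)


def compress_batch(stanzas):
    """Client side: compress a batch of XML stanzas as one bzip2 blob."""
    raw = "".join(stanzas).encode("utf-8")
    return raw, bz2.compress(raw)


def decompress_batch(blob):
    """jpolld side: turn the blob back into plain text XML."""
    return bz2.decompress(blob).decode("utf-8")


raw, packed = compress_batch([STANZA] * 50)
print(len(raw), "bytes raw ->", len(packed), "bytes compressed")
```

The jabber server itself never sees the compressed form here, which matches
the idea that only jpolld (or a local proxy on the client side) has to know
about bzip2.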
My second thought is about scalability of the core of the jabber server. We
can already farm out incoming connections to several jpolld multiplexers,
database lookups to a farm of xdb caching lookuppers (yeah that's really a
word! :), but the central JSM for a domain (assuming btw that I want all
users in the same domain, i.e. user at mydomain.fi) is still limited to how
fast that one machine can route packets, a few hundred per second? Forgive me
if there already is a much more elegant solution than this :-) Here goes:
Our domain is jabber.com
Machine 1) internal name Apple
Connected to Jpolld1-A and Jpolld1-B
Machine 2) internal name Orange
Connected to Jpolld2-A and Jpolld2-B
User dennis at jabber.com logs in, round robin DNS puts him on Jpolld1-B.
My real JID is dennis at jabber.com/work
Internally Machine 1 also knows me as dennis at jpolld1-B (already done)
Machine 1 now propagates to all other machines (each machine is connected to
every other machine) "dennis at jabber.com/work - dennis at apple/work".
Now every single machine in our farm has a hashtable entry that says
"dennis at jabber.com/work - dennis at apple/work", except for machine 1 which has
"dennis at jabber.com/work - dennis at jpolld1-B"
However, the amount of memory needed to store one entry would be so small that
this would still work, AND we can dedicate the storing of these entries to a
special machine, with the server machines simply fetching them from the
dedicated machine and caching them for a while. Note that only this particular
string is stored, NOT the actual session data; that is only stored on the
"home" server, i.e. the one that you actually connected to.
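
The "fetch from the dedicated machine and cache for a while" part could look
roughly like the Python sketch below. Everything in it is an assumption for
illustration: the class name, the 60 second lifetime, and the dictionary that
stands in for the dedicated machine. JIDs are written with a real @ here.

```python
import time


class SessionLocationCache:
    """Caches 'real JID -> internal address' strings for a limited time."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # asks the dedicated machine
        self._ttl = ttl_seconds      # how long a cached entry stays valid
        self._clock = clock
        self._entries = {}           # jid -> (internal_address, expiry_time)

    def lookup(self, jid):
        now = self._clock()
        hit = self._entries.get(jid)
        if hit is not None and hit[1] > now:
            return hit[0]            # still fresh, no remote lookup needed
        address = self._fetch(jid)   # missing or expired: ask again
        self._entries[jid] = (address, now + self._ttl)
        return address


# Stand-in for the dedicated machine; only the location string is stored,
# never the session data itself.
DIRECTORY = {"dennis@jabber.com/work": "dennis@apple/work"}
calls = []


def fetch_from_directory(jid):
    calls.append(jid)
    return DIRECTORY[jid]


cache = SessionLocationCache(fetch_from_directory, ttl_seconds=60.0)
first = cache.lookup("dennis@jabber.com/work")
second = cache.lookup("dennis@jabber.com/work")  # served from the cache
```

A cached entry can be stale for up to the lifetime if a user logs out or
moves, so the lifetime is a trade-off between lookup traffic and freshness.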
Now, Harry connects, round robin puts him on Jpolld1-A
Propagation takes place:
Machine 1)
dennis at jabber.com/work - dennis at jpolld1-B
harry at jabber.com/school - harry at jpolld1-A
Machine 2)
dennis at jabber.com/work - dennis at apple/work
harry at jabber.com/school - harry at apple/school
Harry sends a message to Dennis. Machine 1 looks in its hashtable, sees the
message has to be delivered to jpolld1-B and does so.
Now, Suzan connects, round robin puts her on Jpolld2-B
Propagation:
Machine 1)
dennis at jabber.com/work - dennis at jpolld1-B
harry at jabber.com/school - harry at jpolld1-A
suzan at jabber.com/home - suzan at orange/home
Machine 2)
dennis at jabber.com/work - dennis at apple/work
harry at jabber.com/school - harry at apple/school
suzan at jabber.com/home - suzan at jpolld2-B
Now Suzan sends a message to Dennis. It goes from Suzan's client via
jpolld2-B to her machine, orange.
Orange looks up dennis at jabber.com and sees that he has one session, which is
dennis at apple/work. Machine 2 (orange) now sends the message to Machine 1
(apple); apple receives it, sees that this session is managed by jpolld1-B
and sends it to dennis at jpolld1-B.
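
The walkthrough above can be replayed with a small Python sketch. It is only a
model of the idea, not how jabberd works: the Machine class and its methods
are invented for illustration, and the table stores a (kind, target) pair
instead of the "dennis at apple/work" strings to keep the lookup simple.

```python
class Machine:
    """One server in the farm, with its own routing hashtable."""

    def __init__(self, name):
        self.name = name
        self.peers = {}   # machine name -> Machine (every machine knows all others)
        self.table = {}   # real JID -> ("jpolld", name) or ("machine", name)

    def connect(self, other):
        self.peers[other.name] = other
        other.peers[self.name] = self

    def login(self, jid, jpolld):
        # The home machine remembers which jpolld holds the session...
        self.table[jid] = ("jpolld", jpolld)
        # ...and propagates only "this JID lives on me" to everyone else.
        for peer in self.peers.values():
            peer.table[jid] = ("machine", self.name)

    def route(self, jid, path=None):
        """Return the hops a message to jid takes, starting at this machine."""
        path = (path or []) + [self.name]
        kind, target = self.table[jid]
        if kind == "jpolld":
            return path + [target]                 # local session: deliver
        return self.peers[target].route(jid, path)  # forward to home machine


apple, orange = Machine("apple"), Machine("orange")
apple.connect(orange)
apple.login("dennis@jabber.com/work", "jpolld1-B")
apple.login("harry@jabber.com/school", "jpolld1-A")
orange.login("suzan@jabber.com/home", "jpolld2-B")

harry_to_dennis = apple.route("dennis@jabber.com/work")
suzan_to_dennis = orange.route("dennis@jabber.com/work")
print(harry_to_dennis)   # ['apple', 'jpolld1-B']
print(suzan_to_dennis)   # ['orange', 'apple', 'jpolld1-B']
```

A message never takes more than one machine-to-machine hop in this model,
because every machine knows the home machine of every session.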
If this would work, how much traffic would it save? Btw if this already works
like that please tell me how :-) Would it be hard to implement? Are there
issues I have totally missed that would make this impossible?
With this technique and a number of servers, each having for example 2
jpolld multiplexers, you can also implement load balancing. Although I don't
remember exactly right now, I believe there was a redirect stream error, so a
server can redirect a client to a different IP? Based on statistics about
load, message flow between components, socket/bandwidth usage per jpolld and
cpu/memory consumption per server, an intelligent redirecting policy can be
dynamically maintained.
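
As a very rough illustration of such a policy, here is a Python sketch that
only makes the decision; how the redirect actually reaches the client is not
shown. The statistics fields, the weights and the margin are all invented
numbers, not measurements.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    cpu: float        # fraction of cpu in use, 0.0 to 1.0
    memory: float     # fraction of memory in use, 0.0 to 1.0
    bandwidth: float  # fraction of link capacity in use, 0.0 to 1.0


def load_score(stats, weights=(0.5, 0.2, 0.3)):
    """Combine the statistics into one number; higher means busier."""
    w_cpu, w_mem, w_bw = weights
    return w_cpu * stats.cpu + w_mem * stats.memory + w_bw * stats.bandwidth


def pick_redirect_target(current, servers, margin=0.2):
    """Return the name of a clearly less loaded server, or None to stay put."""
    best = min(servers, key=load_score)
    if best.name != current.name and load_score(current) - load_score(best) > margin:
        return best.name
    return None


apple = ServerStats("apple", cpu=0.9, memory=0.7, bandwidth=0.8)
orange = ServerStats("orange", cpu=0.2, memory=0.3, bandwidth=0.1)

from_apple = pick_redirect_target(apple, [apple, orange])
from_orange = pick_redirect_target(orange, [apple, orange])
print(from_apple, from_orange)   # orange None
```

The margin is there so that clients are not bounced back and forth between
two servers with nearly the same load.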
Then again, maybe I am just rambling :-))
Hope to hear some ideas,
cheers!
Dennis
PS On a sidenote, I managed to write what I initially started as a jabberd
component (see Transport different approach thread) by using jpolld as a
reference and using the libjabber and libxode libraries to write a standalone
executable. It doesn't depend on etherx for my connections, allows pthreading
and basically rocks :) libxode is a very nice library .. kudos guys.