[JDEV] jabberd patch

Wed Feb 26 18:32:52 CST 2003

Richard Dobson <richard at dobson-i.net> wrote on 27-2-2003 1:17:37:
>
>> In fact when I shutdown my windows jabberd 1.4.2 with ctrl-c it 
>> doesn't say anything at all.. it just disconnects the socket.
>
>Sure but the win32 server is not the best thing to judge by, it has 
>notorious problems, hopefully jabberd2 will solve them. Also that just 
>seems to be another implementation problem to me so doesn't strictly 
>have much bearing on these discussions.

I don't know if the Linux version is any different. If it does send a
stream:error, my guess is that it sends a
<stream:error>Disconnected</stream:error> on shutdown, in which case I
still definitely want to reconnect.

But this is beside the point: you shouldn't be looking at the CDATA.
As soon as you do, you're creating problems. Just because the most
used jabber server(s) happen to put "Disconnected" there doesn't mean
you should rely on it. If all clients did, it would basically become
a deprecated field (because you can't use it for anything else) that
lives on and on, because changing it would break clients.
Clients that implement this hack will also be less likely to "upgrade"
to the new solution if they don't feel like fixing the same problem
twice.
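
To make the fragility concrete, here is a minimal sketch of a client
that matches on the CDATA. It assumes that "Disconnected" is what a
server sends when a session is replaced; that meaning and the function
names are illustrative assumptions, not documented behaviour. The
second wording is the one suggested further down in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "stream" prefix on Jabber streams.
STREAMS_NS = "http://etherx.jabber.org/streams"


def error_text(fragment: str) -> str:
    """Return the human-readable text inside a <stream:error/> fragment."""
    element = ET.fromstring(fragment)
    return (element.text or "").strip()


def reconnect_by_cdata(fragment: str) -> bool:
    """Hypothetical hack: refuse to reconnect only when the text is exactly
    "Disconnected", assuming that this means the session was replaced."""
    return error_text(fragment) != "Disconnected"


def make_error(text: str) -> str:
    """Build a standalone <stream:error/> fragment for this demo."""
    return f'<stream:error xmlns:stream="{STREAMS_NS}">{text}</stream:error>'


old_wording = make_error("Disconnected")
new_wording = make_error("Replaced by another session: disconnected")

print(reconnect_by_cdata(old_wording))  # False: the hack holds back
print(reconnect_by_cdata(new_wording))  # True: same cause, new text, hack fails
```

The second call is the problem: the cause is the same, but because the
human-readable text changed, the client starts reconnecting again.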

>> Look, I already quoted from the documentation that there can be more
>> then one reason why a jabber-server would send a stream:error. What 
>> the description should be it doesn't say anywhere, nor should it 
>> because it's supposed to be a human readable description, it's not 
>> meant for letting your client distinguise what type of stream error 
>> it is. 
>
>Of course it shouldnt be used if you have an alternative, but at the 
>moment until the stream error code discussions have finalised we dont, 
>it is the only thing we can use to even remotely guess what is 
>happening, im not arguing that the protocol shouldn't be altered to 
>make it better using the error codes but we need a solution now for 
>all the misbehaving clients until jabber servers with the new protocol 
>have been deployed everywhere (possibly quite a long way off).

If you want to put a hack in your client that's fine by me but it 
doesn't mean other clients are obligated to. IMHO it's the protocol and 
the server that are misbehaving since they kick the client without 
giving any reason. 

>
>> Can you agree with me on this?
>
>Yes but as ive said above we need to work with what we have at the 
>moment to solve the problems until servers with the updated protocol 
>have been widely deployed. So if you dont want to use the CDATA to 
>determine the reason for disconnection then we will just need to have 
>it so clients must not try auto-reconnecting when they get a stream 
>error followed by a stream end, but if the client gets an error code 
>(because they are using an updated server) they can use that to 
>determine if they can try auto-reconnecting, but if there is no error 
>code must not try to auto-reconnect (the way Exodus works).

Clients and servers should be based on the documentation of the
protocol, not on the implementation of one or many servers. As long as
the protocol is broken you can't fix this without breaking something
else. The CDATA as described in the protocol is NOT meant to determine
the reason for disconnection, so my strong advice to any client
developer is: Don't!

>
>> If you can then maybe you can also agree with me that according to 
>> the documentation there can be different causes, and that some 
>> clients will want to auto-reconnect in some of those cases.
>
>Yes but if you dont want to use the CDATA to try and find out what the 
>error is we must just use the lowest denominator, and because at least 
>one reason for disconnection means you shouldn't reconnect if you get 
>a stream:error you must not reconnect.

So you can choose between a few options here:

1. Accept that the protocol is broken and don't implement reconnect,
thus compromising on the functionality of your client (some people
will switch to a different client if they can).

2. Introduce a bad hack to solve this bug in some cases. This
compromises functionality in some other cases.

3. Leave things as they are, and accept that the *protocol* is flawed
and should be fixed. The more the problem occurs, the faster it will
be fixed properly and the sooner server admins will upgrade. Don't
compromise on functionality, because the error is in the protocol and
servers.

As you can see I prefer 3. You can choose 1 or 2 if you want, but you
can't accuse developers who choose 3 of misbehaving.

>> As far as I know jabberd 1.4.2 does this, yes. But it shouldn't make 
>> a difference what it says. Maybe jabberd2 says 
>> <stream:error>Replaced by another session: 
>> disconnected</stream:error></stream:stream>, it would make a lot 
>> more sense to me but in your world this would mean all the clients 
>> would be broken again? 
>
>Ah but hopefully we can get the stream error codes into jabberd2 
>before it goes final so they can be used to reliably determine the 
>reason for disconnection.

I hope so too.. there has been discussion on the XMPPWG but I haven't
read the outcome, if any (do the mailing list archives have a search
function yet?)

>
>> They only have this "bug" because the server doesn't let them know 
>> why they are disconnected. If Exodus fixes this with a hack that 
>> scans for "Disconnected" (wich I find hard to believe since it 
>> really *is* such a big hack) or if it simply doesn't reconnect at 
>> all on <stream:error> that probably work on jabberd 1.4.2 and maybe 
>> some others too, but it is and will be a hack that no other client 
>> has to have, *since it's a hack*. That's why the rest don't HAVE to 
>> manage, or should IMHO. 
>
>Exodus seems to fix this by just detecting stream:error's and not then 
>trying to reconnect which I think is perfectly reasonable for other 
>clients to do until stream:error codes are widely spread in servers, 
>but as ive said to solve the problem we need to handle it for all the 
>thousands of jabberd servers that are already deployed, not just wait 
>for the protocol change since that doesn't help the already deployed 
>servers.

I really still don't think you should solve problems by introducing
hacks into the client. If we fix the protocol, and then upgrade the
clients *properly*, this will only be a good incentive for those
thousands of servers to upgrade. If we introduce a hack, servers are
less likely to upgrade and clients less likely to implement a proper
fix, because it already works "properly" in a large part of all cases.
This way the hack will become "semi-official" and we all know what kind
of problems that brings along..

>Ah well since there are not very many different servers available that 
>have a significant deployment I dont see this as a problem, 

This is exactly the kind of thinking I'm opposing here. Just because a
(very) large part of the servers supports your hack, that's a reason to
go ahead with it? This is what incompatibilities are made of! It goes
directly against the thinking behind open standards because it corrupts
them.

Example: 
Let's say that jabberd1.4.2 for win32 and linux do not send a 
<stream:error> when shutting down. According to you I'm not allowed to 
reconnect if it sends a stream:error, but it doesn't so I reconnect. 
Now I'm writing a new jabber server SuperExtraXMPPJabberD. When it 
shuts down to restart it's very polite and sends a <stream:error>The 
server is restarting</stream:error> (or if you're still into matching 
CDATA it sends "Disconnected" instead). However users start complaining 
that their clients won't autoreconnect like with jabberd. I "explain"
this lengthy discussion to them, but some don't care and install a
different server. Now my market share is even smaller :(

I have some free time.. and there's a new standard with error codes. I
implement it.. however, the clients don't implement it as fast as they
could because they already solved the problem. Well, solved.. they used
some hack you suggested. But hey, it works on all the servers with a
significant deployment eh?

Anyway.. it's a very small issue.. if some client developers want to go
with it, it won't keep me awake at night (if error codes become
available I hope they'll upgrade to them ASAP though). But arguing that
other client developers should stimulate this, or suggesting they
misbehave if they don't... well ok, it still won't keep me awake at
night but it's enough to pull me into this discussion we're having :)

>as since 
>most new servers will contain the stream:error codes being worked on 
>(i.e. the newer protocol specs) so it is really only the currently 
>deployed (legacy) servers we really need to worry about.

I don't think it's the client developers who should be worrying about
deployed legacy servers. Rather the developers of those servers
(keeping them updated) and the server admins (keeping them upgraded).
The role of the client developers in this should be keeping the clients
updated. This will stimulate server admins to upgrade, and server
developers to keep updating, since client developers focus their energy
on implementing the new features they introduce instead of spending it
on worrying about legacy servers and introducing hacks to make things
work on those with the largest deployment.

You can disagree with me on this if you want, but I doubt you'll ever 
convince me of something else. 

>So overall I think we should just not auto-reconnect upon the 
>reception of a stream:error followed by a stream end, but if we 
>receive an error code (currently being worked on) in the stream:error 
>which tells us the reason we can use that to do different things.

When the new stream:error codes are there and enough servers support
them I would definitely implement this behaviour (no more reconnect on
a stream:error without a code), but not before. By then this is no
longer a hack or a corruption of the standard, because servers that
don't support the codes are outdated and should be updated to the new
standard. Having my client no longer support them will have the same
positive effect on server developers and server admins who are behind
on updating and upgrading.
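
The policy in the paragraph above can be sketched roughly as follows.
The code name "session-replaced" and the flag `codes_widely_deployed`
are made up for illustration; no error codes have been defined yet, and
treating every other code as safe to reconnect on is also an
assumption.

```python
from typing import Optional

# Hypothetical code names; the thread does not define any codes.
NO_RECONNECT_CODES = {"session-replaced"}


def should_reconnect(code: Optional[str], codes_widely_deployed: bool) -> bool:
    """Decide whether to auto-reconnect after a stream:error.

    code: machine-readable error code, or None if the server sent none.
    codes_widely_deployed: whether enough servers support error codes
        that a code-less stream:error marks an outdated server.
    """
    if code is not None:
        # A code is present: act on it, never on the human-readable text.
        return code not in NO_RECONNECT_CODES
    # No code: keep reconnecting for now; once codes are widespread,
    # stop reconnecting on a stream:error that carries no code.
    return not codes_widely_deployed


print(should_reconnect(None, codes_widely_deployed=False))
print(should_reconnect(None, codes_widely_deployed=True))
print(should_reconnect("session-replaced", codes_widely_deployed=False))
```

Note that the decision never reads the CDATA; it only depends on
whether a code is present and what that code is.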

But consider your option: if we all drop reconnecting from our clients
today, the problem of "fighting for resources" will suddenly disappear
from the "real world". Admins won't be bugged by it anymore since it
doesn't happen anymore, which means server developers will put fixing
it at a lower spot on their todo lists. That in turn means client
developers and users are stuck even longer without proper
reconnecting..

>
>> Proper error-codes and documented behaviour for closing a stream and
>> rejecting login because of duplicate sessions is needed. A means of
>> indicating that you don't want to "hijack" another session is nice 
>> too, since it increases functionality for all clients that want to 
>> implement it.
>
>Yes I think some way of a client specifying that it doesn't want to 
>hijack an existing session is the best way to go rather than 
>standardizing the hack Wes has done, since once the anti-hijack is 
>done the hack is unnecessary and bad for other clients.

I think stream:error codes are needed anyway (and I think you agree 
with me on this). 

That doesn't mean it's wrong to give the server admin a choice. I
completely agree with Matthew M. on this. We might want to consider an
approach where there are 3 ways a client can authenticate:

1. specifying it doesn't want to hijack the session if it exists (in
which case it should always get a 409 if the session exists)

2. not specifying anything (legacy clients and clients that don't
(want to) support this). It's up to the admin to decide whether to
allow them to hijack sessions or not. (On a public jabber server I'd
allow this, unless many what-are-by-then-"old" clients that don't
support stream:error codes yet cause too many problems.)

3. specifying that it wants to hijack the session if it exists (on
*any* public jabber server I would allow this, since only the clients
that properly support stream:error codes should use this option so
there will be no fighting for resources. In the "paranoia" case of
Matthew M. or the useful case of Wes you could still 409 this)
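
Seen from the server side, the three options above could be sketched
like this. Only the 409 comes from this thread; the enum names, the two
admin switches and the function are hypothetical, not part of any
existing mod_auth.

```python
from enum import Enum


class HijackPreference(Enum):
    NEVER = 1        # option 1: client refuses to replace a session
    UNSPECIFIED = 2  # option 2: legacy client, says nothing
    ALLOW = 3        # option 3: client asks to replace the session


CONFLICT = 409  # status mentioned in the thread for an existing session


def login_conflicts(session_exists: bool,
                    preference: HijackPreference,
                    admin_allows_unspecified: bool = True,
                    admin_allows_hijack: bool = True) -> bool:
    """Return True if the server should answer the login with a 409."""
    if not session_exists:
        return False  # nothing to hijack, login proceeds
    if preference is HijackPreference.NEVER:
        return True   # option 1 always gets a 409 here
    if preference is HijackPreference.UNSPECIFIED:
        return not admin_allows_unspecified  # admin decides for legacy
    return not admin_allows_hijack  # option 3, unless the admin forbids it


print(login_conflicts(True, HijackPreference.NEVER))
print(login_conflicts(True, HijackPreference.ALLOW))
```

The two admin switches are what leave the choice with the server admin:
a "paranoid" setup turns both off and answers every duplicate login
with a 409.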

This would also allow a nice extra: a client could first log in with
method 1. If it gets back a 409 it could ask the user whether to
hijack (we might need a better term here.. I think JDEV has drawn
enough Echelon attention by now) the session or not, and then (if the
user wants this) try method 3.
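
A rough sketch of that client flow, assuming a hypothetical
`authenticate(mode)` call that returns a status and an `ask_user()`
callback. The mode labels and the success value are invented for
illustration; only the 409 comes from the text above.

```python
from typing import Callable

CONFLICT = 409  # from the thread: the session already exists
OK = 0          # hypothetical success value

NO_HIJACK = "no-hijack"  # method 1 (hypothetical label)
HIJACK = "hijack"        # method 3 (hypothetical label)


def login(authenticate: Callable[[str], int],
          ask_user: Callable[[], bool]) -> bool:
    """Try method 1 first; on a 409, ask the user before trying method 3."""
    status = authenticate(NO_HIJACK)
    if status != CONFLICT:
        return status == OK
    if not ask_user():
        return False  # user chose to leave the existing session alone
    return authenticate(HIJACK) == OK


def fake_server_with_existing_session(mode: str) -> int:
    """Stand-in server: a session exists, so only an explicit takeover works."""
    return OK if mode == HIJACK else CONFLICT


print(login(fake_server_with_existing_session, ask_user=lambda: True))
print(login(fake_server_with_existing_session, ask_user=lambda: False))
```

The user is only prompted when there really is a conflict, so the
common case of a fresh login stays a single round trip.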

Such a proposal would probably have the best chance of getting accepted
if it's backed by an implementation, though (who knows.. I might have
to write a custom mod_auth in the not too distant future).

-- 
Tijl Houtbeckers
Software Engineer @ Splendo
The Netherlands