[jdev] Necessity of stringprep support for the client

Mon Aug 20 14:56:37 UTC 2012

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/18/12 9:29 PM, Waqas Hussain wrote:
> On Sun, Aug 19, 2012 at 7:13 AM, Peter Saint-Andre
> <stpeter at stpeter.im> wrote:
>> On 8/17/12 5:57 PM, Ralph Meijer wrote:
>>> On 2012-08-17 18:22, Peter Saint-Andre wrote:
>>>> On 8/17/12 10:16 AM, Jack Moffitt wrote:
>>>>>> Heck, it sounds like a simple little spec, maybe I'll
>>>>>> write it up over the weekend. ;-)
>>>>> 
>>>>> I suggest that the JavaScript side API be the same as the
>>>>> W3C one, so that this can act as a shim for browsers that
>>>>> don't yet have that support.
>>>>> 
>>>>> If we made it an HTTP API, then people outside the XMPP
>>>>> world could use the same thing. The only thing we'd really
>>>>> need is some modification of the stream features to include
>>>>> the API endpoint so that clients can find it.
>>>> 
>>>> Well, I'd see HTTP and XMPP as two different ways of
>>>> accessing the same service. Given that such a service could
>>>> be resource-intensive to run (in fact, the XEP would need
>>>> some security considerations about denial of service
>>>> attacks), I would think that client authentication or
>>>> registration would be necessary or strongly suggested. In the
>>>> case of XMPP, the server is in charge and I expect that it
>>>> would offer this service only to its registered users (and
>>>> any abusive users from its domain could be easily disabled).
>>>> In the case of HTTP, the story is less clear to me.
>>> 
>>> What about stringprepping (parts of) the JIDs used to connect
>>> to the server? I.e. before feature negotiation is complete and
>>> the client may start sending stanzas? I'm thinking of the
>>> stream's addressing attributes, username (SASL) and resource
>>> (resource binding).
>> 
>> Right, but the server will correct your full JID during 
>> authentication. After that, you could check every non-ASCII JID
>> or JID-part with the server-side prepping service.
>> 
>> Peter
>> 
> 
> There are four classes of JID slots relevant to this problem:
> 
> 1. The stream tag
> 2. SASL
> 3. Top-level attributes of stanzas
> 4. JID fields inside stanzas
> 
> 1. The stream tag
> 
> The server preps for you. The client doesn't need to know prepping.
> In case a hostname fails prepping, you would get a <host-unknown/>
> error.
> 
> A host-unknown error seems to be enough here. The client can show
> a helpful message saying the hostname is incorrect. What else would
> it say if it knew the hostname failed prepping? How is a prepping
> failure different from a simple typo which passes prepping as far
> as users are concerned? Users don't know what prepping is.

For sure. Hopefully this is something the client would get wrong
exactly once, from then on caching whatever format worked in its
configuration for that account. Given that this is a bootstrapping
problem, a server-side service won't be of help here. :)

> If you really really need the information, add an application
> specific error element:
> 
> <stream:error>
>   <host-unknown xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>
>   <jid-malformed xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>
> </stream:error>
> 
> Or define a new stream error.

I think <host-unknown/> is enough here.
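
As a rough sketch of how a client might act on that error without knowing anything about prepping (Python purely for illustration; the function names and the message wording are made up for this example, only the two namespaces come from the stream definition):

```python
import xml.etree.ElementTree as ET

# Namespace of the defined stream error conditions.
STREAMS_NS = "urn:ietf:params:xml:ns:xmpp-streams"


def stream_error_condition(xml_text):
    """Return the local name of the stream error condition, or None."""
    root = ET.fromstring(xml_text)
    for child in root:
        # ElementTree spells qualified names as "{namespace}local".
        if child.tag.startswith("{" + STREAMS_NS + "}"):
            name = child.tag.split("}", 1)[1]
            if name != "text":  # <text/> is the optional description
                return name
    return None


def user_message(condition, hostname):
    """Turn an error condition into something a user can act on."""
    if condition == "host-unknown":
        return ("The server does not recognize the hostname '%s'. "
                "Please check it for typos." % hostname)
    return "The server closed the connection (%s)." % condition


example = (
    '<stream:error xmlns:stream="http://etherx.jabber.org/streams">'
    '<host-unknown xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>'
    '</stream:error>'
)

if __name__ == "__main__":
    print(user_message(stream_error_condition(example), "example.com"))
```

The user sees the same message whether the hostname failed prepping or was simply mistyped, which is the point made above.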

> 2. SASL
> 
> The same reasoning almost applies to SASL, but not quite: only
> SASLprep is certain for SASL. It's common for deployments to
> delegate SASL to other services such as LDAP servers. The SASL
> authcid may happen to be the XMPP username in many server
> configurations by default, but this is not a requirement of the
> protocol. Clients forcibly applying nodeprep here is harmful, and a
> constant source of annoyance when authenticating against external
> systems.

Yes, this is something to perhaps make even clearer in 6120bis.

While working on saslprepbis and 6122bis in the PRECIS and XMPP
working groups respectively, I've been trying to align the two
approaches a bit more so that we'll have greater consistency in the
future. However, I'd appreciate further reviews from XMPP folks on
this point.
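
To make the difference from nodeprep concrete, here is a minimal sketch of the SASLprep profile (RFC 4013) built on the tables in Python's standard `stringprep` module. The function name `saslprep` and the `allow_unassigned` parameter are choices made for this example; treat it as an illustration of the steps, not a vetted implementation.

```python
import stringprep
import unicodedata

# Prohibited-output tables named by the SASLprep profile.
_PROHIBITED = (
    stringprep.in_table_c12,      # non-ASCII space characters
    stringprep.in_table_c21_c22,  # control characters
    stringprep.in_table_c3,       # private use
    stringprep.in_table_c4,       # non-character code points
    stringprep.in_table_c5,       # surrogate codes
    stringprep.in_table_c6,       # inappropriate for plain text
    stringprep.in_table_c7,       # inappropriate for canonical representation
    stringprep.in_table_c8,       # change display properties / deprecated
    stringprep.in_table_c9,       # tagging characters
)


def saslprep(text, allow_unassigned=True):
    """Prepare an authcid or password; raise ValueError if prohibited."""
    # 1. Map: non-ASCII spaces become U+0020, table B.1 maps to nothing.
    mapped = []
    for ch in text:
        if stringprep.in_table_c12(ch):
            mapped.append(" ")
        elif not stringprep.in_table_b1(ch):
            mapped.append(ch)
    # 2. Normalize with NFKC, using the Unicode 3.2 data stringprep expects.
    out = unicodedata.ucd_3_2_0.normalize("NFKC", "".join(mapped))
    if not out:
        return out
    # 3. Prohibit.
    for ch in out:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError("prohibited character U+%04X" % ord(ch))
        if not allow_unassigned and stringprep.in_table_a1(ch):
            raise ValueError("unassigned code point U+%04X" % ord(ch))
    # 4. Bidi: RandALCat and LCat must not mix, and a string containing
    #    RandALCat characters must start and end with one.
    if any(stringprep.in_table_d1(ch) for ch in out):
        if any(stringprep.in_table_d2(ch) for ch in out):
            raise ValueError("mixed right-to-left and left-to-right text")
        if not (stringprep.in_table_d1(out[0])
                and stringprep.in_table_d1(out[-1])):
            raise ValueError("right-to-left text must start and end RTL")
    return out
```

Note that there is no case-folding step, so `saslprep("USER")` stays `"USER"`. A client that lower-cases the authcid the way nodeprep would has changed the credential before the external system ever sees it.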

> 3. Stanzas
> 
> The server gives you back a <jid-malformed/> error. What more do
> you need?

Nothing.

> 4. JID slots inside stanzas
> 
> Now this is a sticky problem. But this problem isn't associated
> with just JavaScript-based applications. Most (all? I tested a
> while ago and found zero; has this changed?) mobile clients and
> many desktop clients simply don't prep. Or when they do, it's as
> simple as lower-casing ASCII characters and checking for a few
> forbidden ASCII characters. This requires further discussion and
> thought.

Or code. :)
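
For reference, the "simple" approach described above amounts to something like the following sketch. The function name is hypothetical, and the forbidden set is the ASCII subset of what the Nodeprep profile in RFC 6122 (Appendix A) prohibits. It deliberately refuses anything outside ASCII, which is exactly the gap under discussion.

```python
# ASCII characters that Nodeprep prohibits in a localpart: the eight
# listed in the profile plus the ASCII space.
NODEPREP_FORBIDDEN_ASCII = set('"&\'/:<>@ ')


def naive_ascii_nodeprep(localpart):
    """Lower-case an ASCII-only localpart, rejecting forbidden characters.

    This is NOT stringprep: any non-ASCII input is refused outright,
    because handling it correctly needs the full Unicode tables.
    """
    if not localpart:
        raise ValueError("empty localpart")
    for ch in localpart:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("non-ASCII input needs real stringprep")
        if code < 0x20 or code == 0x7F:
            raise ValueError("control character U+%04X" % code)
        if ch in NODEPREP_FORBIDDEN_ASCII:
            raise ValueError("forbidden character %r" % ch)
    return localpart.lower()
```

A client built this way handles `Juliet` correctly and fails loudly on a localpart such as `jürgen` instead of guessing.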

> -- Now, as for the JID validating service that is being discussed:
> 
> 1. A server side service
> 
> This seems very problematic. Client gets roster with 1000 JIDs.
> Roster JIDs are not guaranteed to be normalized (might have upper
> case characters, etc). Does it send them all back for
> normalization? This will get very expensive very fast. And not just
> for the user, but the server as well.

Well, you'd prep a JID on first adding it to your roster, and the
server is supposed to check it as well at that point (so the client
shouldn't be asking the server to "pre-prep" a JID that the server is
going to be checking anyway). Thus the client isn't going to be asking
the server to prep thousands of JIDs every time it retrieves its
roster. You check a JID the first time it enters the system and then
you assume it's good.
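
In client terms, that policy might be sketched as follows. The `Roster` class and its `prep` callback are hypothetical names for this example; `prep` stands in for whatever actually does the work, whether a local library or a round trip to a server-side service.

```python
class Roster:
    """Check a JID once, when it first enters the roster, then trust it."""

    def __init__(self, prep):
        # prep: callable taking a raw JID string and returning the
        # normalized form, or raising ValueError if it is malformed.
        self._prep = prep
        self._items = {}  # normalized JID -> display name

    def add(self, raw_jid, name=""):
        """User-initiated add: the only place prepping happens."""
        jid = self._prep(raw_jid)
        self._items[jid] = name
        return jid

    def load_from_server(self, items):
        """Roster retrieval: entries are assumed to have been checked
        by the server when they were first added, so no prepping here."""
        for jid, name in items:
            self._items[jid] = name

    def __contains__(self, jid):
        return jid in self._items


if __name__ == "__main__":
    calls = []

    def counting_prep(raw):
        # Stand-in for a real prep step; it only lower-cases.
        calls.append(raw)
        return raw.lower()

    roster = Roster(counting_prep)
    roster.load_from_server(
        [("contact%d@example.com" % i, "") for i in range(1000)])
    roster.add("Romeo@Example.NET", "Romeo")
    print(len(calls))
```

Retrieving a 1000-entry roster costs no prep calls at all; only the single user-initiated add does.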

> 2. JavaScript-based prepping library
> 
> Is this not considered feasible? I'd very much like to see some 
> numbers. How large are the relevant data tables?

Well, a zipped copy of the full Unicode Character Database is 2.6 MB.
Not all of that might be needed for prepping of JIDs, but much of it
will be (especially if we're doing proper bidi handling). Even just
the uncompressed UnicodeData.txt file is 1.3 MB.

> And the data can be split up by language, compressed, and loaded on
> demand, so download size may not be an issue at all. You also get
> to use CDNs, since it would all be static files. The XSF could fund
> this.
> 
> 3. <message|iq|presence to="..." from="..."
>      xmlns:j="jidprep namespace"
>      j:jids="space separated list of JIDs"/>
> 
> The server (if it supports the feature) checks the 'to' and 'from' 
> attributes as normal, and in addition also checks the jids in that 
> prefixed attribute. It returns a jid-malformed error on failure.

Correct.
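
Purely to illustrate the server-side check being described, here is a sketch. The namespace `urn:example:jidprep` is a placeholder (the proposal above does not name one), and `looks_like_jid` is a structural stand-in for real prepping, not an implementation of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; the proposal only says "jidprep namespace".
JIDPREP_NS = "urn:example:jidprep"


def looks_like_jid(jid):
    """Placeholder validity check: structure only, no stringprep."""
    bare, slash, resource = jid.partition("/")
    if slash and not resource:
        return False  # trailing "/" with an empty resource
    local, at, domain = bare.partition("@")
    if not at:
        domain = bare  # domain-only JID
    elif not local:
        return False  # "@" with an empty localpart
    return bool(domain) and "@" not in domain


def malformed_jids(stanza_xml, is_valid=looks_like_jid):
    """Return every JID in 'to', 'from' and j:jids that fails the check."""
    root = ET.fromstring(stanza_xml)
    candidates = [root.get("to"), root.get("from")]
    # A namespaced attribute is keyed as "{namespace}name" in ElementTree.
    candidates += root.get("{%s}jids" % JIDPREP_NS, "").split()
    return [jid for jid in candidates
            if jid is not None and not is_valid(jid)]


stanza = (
    '<iq xmlns="jabber:client" xmlns:j="urn:example:jidprep" type="get" '
    'to="example.com" from="juliet@example.com/balcony" '
    'j:jids="romeo@example.net @broken nurse@example.com"/>'
)

if __name__ == "__main__":
    # A non-empty result is where the server would answer with a
    # <jid-malformed/> error.
    print(malformed_jids(stanza))
```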

> This cleanly avoids any extra round trips for outgoing stanzas.
> For incoming stanzas, you can send the JID list to yourself (ping 
> yourself, or send a message to yourself). That said, I don't 
> personally like the idea of such a protocol.

Agreed, that's ugly. :)

Peter

- -- 
Peter Saint-Andre
https://stpeter.im/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.18 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlAyUCUACgkQNL8k5A2w/vy5/QCfTBnHL/0+wK46pqu9JLgi6x7G
C2cAnRfCMGUhUdmbl4h+zDrlg3s9RC5T
=XpSI
-----END PGP SIGNATURE-----