[jdev] Necessity of stringprep support for the client

Sun Aug 19 03:29:12 UTC 2012

On Sun, Aug 19, 2012 at 7:13 AM, Peter Saint-Andre <stpeter at stpeter.im> wrote:
>
> On 8/17/12 5:57 PM, Ralph Meijer wrote:
>> On 2012-08-17 18:22, Peter Saint-Andre wrote:
>>> On 8/17/12 10:16 AM, Jack Moffitt wrote:
>>>>> Heck, it sounds like a simple little spec, maybe I'll write
>>>>> it up over the weekend. ;-)
>>>>
>>>> I suggest that the JavaScript side API be the same as the W3C
>>>> one, so that this can act as a shim for browsers that don't yet
>>>> have that support.
>>>>
>>>> If we made it an HTTP API, then people outside the XMPP world
>>>> could use the same thing. The only thing we'd really need is
>>>> some modification of the stream features to include the API
>>>> endpoint so that clients can find it.
>>>
>>> Well, I'd see HTTP and XMPP as two different ways of accessing
>>> the same service. Given that such a service could be
>>> resource-intensive to run (in fact, the XEP would need some
>>> security considerations about denial of service attacks), I would
>>> think that client authentication or registration would be
>>> necessary or strongly suggested. In the case of XMPP, the server
>>> is in charge and I expect that it would offer this service only
>>> to its registered users (and any abusive users from its domain
>>> could be easily disabled). In the case of HTTP, the story is less
>>> clear to me.
>>
>> What about stringprepping (parts of) the JIDs used to connect to
>> the server? I.e. before feature negotiation is complete and the
>> client may start sending stanzas? I'm thinking of the stream's
>> addressing attributes, username (SASL) and resource (resource
>> binding).
>
> Right, but the server will correct your full JID during
> authentication. After that, you could check every non-ASCII JID or
> JID-part with the server-side prepping service.
>
> Peter
>

There are four classes of JID slots relevant to this problem:
1. The stream tag
2. SASL
3. Top-level attributes of stanzas
4. JID slots inside stanzas

1. The stream tag

The server preps the hostname for you, so the client doesn't need to
know how to prep. If a hostname fails prepping, you get a
<host-unknown/> error.

A host-unknown error seems to be enough here. The client can show a
helpful message saying the hostname is incorrect. What else would it
say if it knew the hostname had failed prepping? As far as users are
concerned, how is a prepping failure different from a simple typo that
happens to pass prepping? Users don't know what prepping is.

If you really do need the information, add an application-specific
error element:

<stream:error>
  <host-unknown xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>
  <jid-malformed xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>
</stream:error>

Or define a new stream error.
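
To illustrate how little the client needs to know, here is a rough
Python sketch that turns the stream error above into a message for the
user. The function name and the message wording are made up for
illustration; the element names and namespaces are the ones from the
example, with the stream prefix bound to
http://etherx.jabber.org/streams as it would be on a real stream.

```python
import xml.etree.ElementTree as ET

STREAMS_NS = "urn:ietf:params:xml:ns:xmpp-streams"
STANZAS_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"


def describe_stream_error(error_xml):
    """Hypothetical helper: map a <stream:error/> to a user-facing message."""
    error = ET.fromstring(error_xml)
    if error.find(f"{{{STREAMS_NS}}}host-unknown") is None:
        return "The server closed the connection with an error."
    message = "The hostname you entered is not served by this server."
    # Optional application-specific hint, as in the example above.
    if error.find(f"{{{STANZAS_NS}}}jid-malformed") is not None:
        message += " It is also not a valid hostname."
    return message


example = (
    '<stream:error xmlns:stream="http://etherx.jabber.org/streams">'
    '<host-unknown xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>'
    '<jid-malformed xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>'
    "</stream:error>"
)
print(describe_stream_error(example))
```

A client that ignores the extra element still shows a sensible message,
which is the point: the hint is optional.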

2. SASL

Almost the same reasoning would apply to SASL, except that it doesn't
quite hold: the only profile you can count on for SASL is SASLprep.
It's common for deployments to delegate SASL to other services such as
LDAP servers. The SASL authcid happens to be the XMPP username in the
default configuration of many servers, but the protocol does not
require this. Clients that forcibly apply Nodeprep here do harm, and
are a constant source of annoyance when authenticating against
external systems.
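
A small Python sketch of why this hurts. It is a deliberately partial
Nodeprep (only the B.1/B.2 mappings, NFKC and the extra prohibited
characters; the C.* tables and bidi checks are left out), built on the
standard library's stringprep module. The two authcids are made-up
examples of what an external system might expect.

```python
import stringprep
import unicodedata

# Characters Nodeprep prohibits on top of the generic stringprep tables.
NODEPREP_EXTRA_PROHIBITED = set("\"&'/:<>@")


def simplified_nodeprep(node):
    """Partial Nodeprep: B.1/B.2 mapping, NFKC, extra prohibited characters."""
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in node
        if not stringprep.in_table_b1(ch)  # B.1: characters mapped to nothing
    )
    normalized = unicodedata.normalize("NFKC", mapped)
    bad = sorted(set(normalized) & NODEPREP_EXTRA_PROHIBITED)
    if bad:
        raise ValueError(f"prohibited characters in node: {bad}")
    return normalized


# Hypothetical authcids: a case-sensitive account name and a realm-style name.
for authcid in ("Alice", "alice@EXAMPLE.COM"):
    try:
        print(repr(authcid), "->", repr(simplified_nodeprep(authcid)))
    except ValueError as exc:
        print(repr(authcid), "rejected:", exc)
```

The first authcid is silently changed and the second is refused, even
though the external system might accept both exactly as typed.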

3. Top-level attributes of stanzas

If a 'to' or 'from' attribute fails prepping, the server gives you
back a <jid-malformed/> error. What more do you need?

4. JID slots inside stanzas

Now this is a sticky problem, but it isn't specific to JavaScript-based
applications. Most mobile clients (all? I tested a while ago and found
none that prep; has this changed?) and many desktop clients simply
don't prep. When they do, it's as simple as lower-casing ASCII
characters and checking for a few forbidden ASCII characters. This
requires further discussion and thought.
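
For concreteness, the ASCII-only approach amounts to something like
the following Python sketch. The function names and the exact
forbidden set are assumptions about what such clients do, not taken
from any particular client. The comparison with the RFC 3454 B.2
mapping from Python's stringprep module shows where it falls short.

```python
import stringprep

# Assumed set of forbidden ASCII characters for a localpart.
FORBIDDEN_ASCII = set("\"&'/:<>@ ")


def naive_prep(node):
    """Lower-case ASCII letters only and reject a few ASCII characters."""
    bad = [ch for ch in node if ch in FORBIDDEN_ASCII]
    if bad:
        raise ValueError(f"forbidden characters: {bad}")
    return "".join(ch.lower() if ch.isascii() else ch for ch in node)


def b2_fold(node):
    """Apply the stringprep B.2 case-folding map to every character."""
    return "".join(stringprep.map_table_b2(ch) for ch in node)


name = "Jos\u00c9"  # made-up localpart ending in an upper-case E with acute
print(naive_prep(name) == b2_fold(name))  # False: they disagree on the last character
```

Two clients using these two functions would treat the same contact as
two different JIDs.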

--
Now, as for the JID validating service that is being discussed:

1. A server side service

This seems very problematic. Say a client gets a roster with 1000
JIDs. Roster JIDs are not guaranteed to be normalized (they might
contain upper-case characters, etc.). Does the client send them all
back for normalization? This will get very expensive very fast, and
not just for the user but for the server as well.
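
If such a service were used anyway, one way to limit the damage might
be Peter's suggestion of only checking non-ASCII JIDs. A rough Python
sketch, with made-up roster entries and a hypothetical helper name,
that splits a roster into JIDs the client could plausibly handle
itself and JIDs it would have to send to the service:

```python
def split_roster(jids):
    """Hypothetical helper: separate pure-ASCII JIDs from the rest."""
    local, remote = [], []
    for jid in jids:
        # Pure-ASCII JIDs only need ASCII lower-casing and a character check.
        (local if jid.isascii() else remote).append(jid)
    return local, remote


# Made-up roster entries for illustration.
roster = [
    "Romeo@example.net",
    "juliet@example.com",
    "r\u00e4ksm\u00f6rg\u00e5s@example.org",
]
local, remote = split_roster(roster)
print(len(local), "handled locally,", len(remote), "sent to the service")
```

This only helps when the roster is mostly ASCII. For users whose
contacts mostly have non-ASCII JIDs, the cost stays the same.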

2. JavaScript-based prepping library

Is this not considered feasible? I'd very much like to see some
numbers. How large are the relevant data tables? And the data can be
split up by language, compressed, and loaded on demand, so download
size may not be an issue at all. You also get to use CDNs, since it
would all be static files. The XSF could fund this.
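
As a starting point for such numbers, here is a rough Python sketch
that counts how many code points fall into each RFC 3454 table, using
the membership tests in Python's standard stringprep module (which is
based on the Unicode 3.2 data that RFC 3454 requires). The labels and
the function name are mine. It counts entries, not bytes, so it says
nothing yet about how well the tables compress.

```python
import stringprep

# RFC 3454 tables exposed as membership tests by the stdlib.
MEMBERSHIP_TABLES = {
    "A.1": stringprep.in_table_a1,
    "B.1": stringprep.in_table_b1,
    "C.1.1": stringprep.in_table_c11,
    "C.1.2": stringprep.in_table_c12,
    "C.2.1": stringprep.in_table_c21,
    "C.2.2": stringprep.in_table_c22,
    "C.3": stringprep.in_table_c3,
    "C.4": stringprep.in_table_c4,
    "C.5": stringprep.in_table_c5,
    "C.6": stringprep.in_table_c6,
    "C.7": stringprep.in_table_c7,
    "C.8": stringprep.in_table_c8,
    "C.9": stringprep.in_table_c9,
    "D.1": stringprep.in_table_d1,
    "D.2": stringprep.in_table_d2,
}


def table_sizes(limit=0x110000):
    """Count code points below `limit` that each table covers."""
    sizes = {name: 0 for name in MEMBERSHIP_TABLES}
    sizes["B.2"] = 0  # counted as "code points the mapping changes"
    for cp in range(limit):
        ch = chr(cp)
        for name, in_table in MEMBERSHIP_TABLES.items():
            if in_table(ch):
                sizes[name] += 1
        if stringprep.map_table_b2(ch) != ch:
            sizes["B.2"] += 1
    return sizes


# Basic Multilingual Plane only, to keep the demo quick;
# call table_sizes() with no argument to scan every code point.
for name, count in sorted(table_sizes(0x10000).items()):
    print(name, count)
```

Several of the large tables (unassigned, private use) are plain ranges,
so a real library would probably store ranges rather than one entry per
code point.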

3. <message|iq|presence to="..." from="..."
   xmlns:j="jidprep namespace" j:jids="space separated list of JIDs"/>

The server (if it supports the feature) checks the 'to' and 'from'
attributes as normal, and in addition checks the JIDs in that
prefixed attribute. It returns a <jid-malformed/> error on failure.

This cleanly avoids any extra round trips for outgoing stanzas. For
incoming stanzas, you can send the JID list to yourself (ping
yourself, or send a message to yourself). That said, I don't
personally like the idea of such a protocol.
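
For what it's worth, building such a stanza is simple. A minimal
Python sketch, assuming a placeholder namespace (urn:example:jidprep
is made up; no such namespace is defined anywhere) and a hypothetical
helper name:

```python
import xml.etree.ElementTree as ET

JIDPREP_NS = "urn:example:jidprep"  # hypothetical namespace, not a real one
ET.register_namespace("j", JIDPREP_NS)


def message_with_jid_check(to, sender, embedded_jids):
    """Build a <message/> carrying extra JIDs for the server to check."""
    msg = ET.Element("message", {"to": to, "from": sender})
    # Space-separated list in a namespaced attribute, as described above.
    msg.set(f"{{{JIDPREP_NS}}}jids", " ".join(embedded_jids))
    return ET.tostring(msg, encoding="unicode")


stanza = message_with_jid_check(
    "juliet@example.com",
    "romeo@example.net/orchard",
    ["nurse@example.com", "benvolio@example.net"],
)
print(stanza)
```

One caveat with a space-separated list: resourceparts may contain
ASCII spaces, so as sketched this is only safe for bare JIDs.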

--
Waqas Hussain