jxmpp icon indicating copy to clipboard operation
jxmpp copied to clipboard

Parsing unescaped jid that contains two '@' symbols.

Open damencho opened this issue 7 years ago • 20 comments

If you want to use JidCreate.entityBareFromUnescaped to parse and validate a jid, there is an issue if the jid is incorrect and contains two '@' symbols. https://github.com/igniterealtime/jxmpp/blob/master/jxmpp-jid/src/main/java/org/jxmpp/jid/impl/JidCreate.java#L453

I think both XmppStringUtils.parseLocalpart and XmppStringUtils.parseDomain should not use int atIndex = jid.indexOf('@');, but instead use int atIndex = jid.lastIndexOf('@');

WDYT?

damencho avatar Oct 23 '17 20:10 damencho

lastIndexOf doesn't work either since @ is explicitly allowed in resource parts. E.g. this is a valid unescaped JID: some/user@host\[email protected]/user@resource/foo. Escaped this would be (I think) some\2fuser\40host\[email protected]/user@resource/foo. Simple [last]indexOf-logic doesn't do this just anymore.

ibauersachs avatar Oct 24 '17 09:10 ibauersachs

I see, I was only looking at entityBareFrom methods, where there is no resource. I will try comming with a PR fr this thing and then we can discuss it.

damencho avatar Oct 24 '17 17:10 damencho

https://github.com/igniterealtime/jxmpp/pull/15

damencho avatar Oct 25 '17 23:10 damencho

I'm trying to get my head around this while still recovering from an illness. So please bear with me. I think it would be clearer for me if you could show me how you invoke a method and what happens, and what you think should happen.

My assumption right now is that you do something like

EntityBareJid jid = JidCreate.entityBareJidFromUnescaped("foo@[email protected]/bar@baz")

which yields an entity bare jid where the localpart is 'foo' and the domainpart is '[email protected]'. Which is not correct, the localpart should be 'foo\40boo' and the domainpart should be 'example.org'. And PR #15 tries to solves this by stripping the resourcepart prior the extraction of the localpart in XmppStringUtils.parseLocalpart(). Is that assessment correct so far?

Flowdalic avatar Oct 27 '17 18:10 Flowdalic

We are using EntityBareJid jid = JidCreate.entityBareJidFrom("foo@[email protected]/bar@baz") and I expected it will throw an error for incorrect jid, but with current code, it doesn't, but with my PR it does as shown in the added test in JidTest. Does this make sense?

damencho avatar Oct 27 '17 18:10 damencho

With this change the domain part will be example.org and node part will be foo@boo

damencho avatar Oct 27 '17 18:10 damencho

node part will be foo@boo

Shouldn't the localpart be escaped, and thus be 'foo\40boo'?

Flowdalic avatar Oct 27 '17 21:10 Flowdalic

Escape is done only when using all ..FromUnescaped methods and is not done using the others. So I expect JidCreate.entityBareJidFrom to throw an exception when passing a wrong jid string representation.

damencho avatar Oct 27 '17 21:10 damencho

Ahh, you used the non-Unescaped variant. Now I am with you. Could you add a test for

JidCreate.entityBareJidFromUnescaped("foo@[email protected]/bar@baz")

too and check that the localpart has the expected form, i.e. 'foo\40boo'?

Flowdalic avatar Oct 27 '17 21:10 Flowdalic

Just added the test for JidCreate.fromUnescaped and escaping.

damencho avatar Oct 27 '17 21:10 damencho

I'm looking at this mobile, but I don't think this works in all cases. / is a valid unescaped character in localparts. The current indexOf would take it as the separator for the resource though. Add a test like I mentioned above and see what happens...?

@flowdalic hope you'll be better soon!

ibauersachs avatar Oct 27 '17 21:10 ibauersachs

That can be easily fixed by not using int slashIndex = jid.indexOf('/');, but use int slashIndex = jid.lastIndexOf('/'); and adding a test with / works that way.

But I was thinking about another case. Can you have a jid that has no resource and is in the form of "foo/[email protected]"? Cause this may be parsed as no node part, domain is foo and resource is [email protected]. And how are we supposed to decide which is the correct parsing?

damencho avatar Oct 27 '17 22:10 damencho

lastIndexOf doesn't work since resourceparts can legally have /, and they don't even need to be escaped.

foo/[email protected] is IMO legal unescaped, but the meaning is unclear. It could either be foo/boo (localpart) @ server.com (node), no resource OR no localpart, foo (node), [email protected] (resourcepart). Not sure if the RFCs or XEPs say something about that case.

ibauersachs avatar Oct 27 '17 22:10 ibauersachs

The domainpart of a JID is that portion after the '@' character (if any) and before the '/' character (if any);

This means that foo/[email protected] should be: domain is foo and [email protected] is the resource. Cause for the local part we should have already identified the domain:

The localpart of a JID is an optional identifier placed before the domainpart and separated from the latter by the '@' character.

And the same for resource, we have identified domain and then parse resource:

The resourcepart of a JID is an optional identifier placed after the domainpart and separated from the latter by the '/' character.

Do you agree with those statements and the parsing of foo/[email protected]? Those are quotes from rfc6122.

damencho avatar Oct 27 '17 22:10 damencho

RFC 6122 has been obsoleted by RFC 7622. The relevant quotes are I think:

§ 3.2 Domainpart

The domainpart of a JID is the portion that remains once the following parsing steps are taken:

  1. Remove any portion from the first '/' character to the end of the string (if there is a '/' character present).

  2. Remove any portion from the beginning of the string to the first '@' character (if there is an '@' character present).

That rules applied, 'foo/[email protected]' yield, if I'm not mistaken, 'foo' as domainpart and '[email protected]' as resourcepart. So yes @damencho I think you are right. :)

Flowdalic avatar Oct 28 '17 08:10 Flowdalic

You're both right, but the RFC only deals with valid JIDs (i.e. not containing @,/,etc. in the localpart). This would be in XEP-0106.

Actually, for Jicofo, where the issue appeared, there must have been an error before:

An entity MUST NOT include the unescaped version of a disallowed character over the wire in any XML stanzas sent to another entity.

The XEP doesn't mention resources since it deals mostly deals with the mapping of existing addresses into JIDs.

ibauersachs avatar Oct 28 '17 08:10 ibauersachs

Yep, for jicofo I reverted to using smack 3.2 and I saw invalid jid exceptions and then I started researching why this doesn't happen with smack 4.

damencho avatar Oct 28 '17 14:10 damencho

Ok, reading rfc7622 how domain part is extracted seems the original code that I was trying to modify works that way, so jid foo@[email protected]/bar@baz, will be: local: foo domain: [email protected] resource: bar@baz. And there is no exception when passing this to JidCreate.entityBareJidFrom. This means my PR is incorrect. I will close it. And in our service, as we expect know what is our domain (example.org), we should validate it ourselves and see that the jid is incorrect. WDYT?

damencho avatar Oct 28 '17 17:10 damencho

And there is no exception when passing this to JidCreate.entityBareJidFrom. This means my PR is incorrect. I will close it.

I'm not sure if I can follow.

Using the current HEAD of jxmpp

EntityFullJid entityFullJid = JidCreate.entityFullFrom("foo@[email protected]/bar@baz");

yields a JID with localpart: foo domainpart: [email protected] resourcepart: bar@baz

What was the behavior with your changes applied?

Now there are two observations here: '[email protected]' is not a correct DNS name because of the '@' sign (RFC 952). And the API was likely incorrectly used, i.e. entityFullFromUnescaped() should have been used, since it's likely an unescaped JID. But if we use the Unescaped variant the result is the same for this this (and again, with the current HEAD).

Therefore I think we should improve jxmpp in the following ways:

  1. It should throw an exception for the invalid domainpart in the entityFullFrom("foo@[email protected]/bar@baz") case
  2. It should correctly parse the localpart in the entityFullFromUnescaped("foo@[email protected]/bar@baz") case

I say we concentrate on 2. for now. 1. can be fixed later as it is not so important because its mostly caused by incorrect API usage.

As far as I can tell, 2. is a tricky one, mostly because the '@' could also appear in the resourcepart. I do wonder if such an API even makes sense (and if we shouldn't deprecate it in it's current form in jxmpp): How often to user enter unescaped full JIDs somewhere into the application? What was the exact situation in Jicofo where you run into this?

Flowdalic avatar Oct 29 '17 08:10 Flowdalic

So with my changes in the PR: EntityFullJid entityFullJid = JidCreate.entityFullFrom("foo@[email protected]/bar@baz"); Will throw XmppStringprepException: Localpart must not contain '@'. (The local part is foo@boo)

As far as I can tell, 2. is a tricky one, mostly because the '@' could also appear in the resourcepart. I do wonder if such an API even makes sense (and if we shouldn't deprecate it in it's current form in jxmpp): How often to user enter unescaped full JIDs somewhere into the application? What was the exact situation in Jicofo where you run into this?

With the current implementation that I did in the PR, the problem is if there is '/' in the localpart, as @ibauersachs pointed out.

We use roomname for the conferences as [email protected], and there is an iq that invites jicofo to enter the room and orchestrate the conference. The problem was that a user entered his email in the mobile application, and this crashed jicofo :) so we fixed that(SmackConfiguration.setDefaultParsingExceptionCallback(new ExceptionLoggingCallback...) and start looking at why there were so many exceptions in this case and comparing, how jicofo was handling this with smack3. And here we are, trying to fix to throw an exception on entering invalid jids. I know that this jid in the first place should have been escaped before reaching jicofo, we should fix that, but we should make sure jicofo don't misbehave in such cases, cause third-party libraries can connect and send incorrect iqs.

damencho avatar Oct 29 '17 15:10 damencho