Add DNS SRV records

Open emersion opened this issue 3 years ago • 26 comments

I've seen a lot of users try to connect to "libera.chat" instead of "irc.libera.chat". This results in connection timeouts.

This is an attempt to improve the status quo.

Previous proposals:

  • https://github.com/ircv3/ircv3-specifications/pull/59
  • https://www.devever.net/~hl/md/irc-srv

emersion avatar Dec 10 '21 10:12 emersion

As much as I like SRV, I wonder if this is the right direction. These days, the internet seems to be moving to "well-known" HTTP URLs, because web apps can't use the DNS.

Obviously web apps wouldn't connect to irc.libera.chat/6697 anyway; but we could define a single well-known for both normal sockets and websockets so that it's less technical overhead for network admins.

progval avatar Dec 10 '21 10:12 progval

You may want to clarify that the example record should be in the IN (Internet) namespace.

    _ircs._tcp IN SRV 0 1 6697 irc.example.org.

aaronmdjones avatar Dec 10 '21 10:12 aaronmdjones
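For reference, the class field is optional in zone-file syntax, so both spellings of the example record describe the same data. A minimal sketch of a parser that accepts either form; the `SrvRecord` type and `parse_srv_line` helper are hypothetical names, and the sketch assumes no TTL field is present on the line:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    owner: str      # e.g. "_ircs._tcp"
    priority: int
    weight: int
    port: int
    target: str     # e.g. "irc.example.org."


def parse_srv_line(line: str) -> SrvRecord:
    """Parse '<owner> [IN] SRV <priority> <weight> <port> <target>'."""
    fields = line.split()
    owner, rest = fields[0], fields[1:]
    # The class is optional; accept "IN" and skip it when present.
    if rest and rest[0].upper() == "IN":
        rest = rest[1:]
    if len(rest) != 5 or rest[0].upper() != "SRV":
        raise ValueError(f"not an SRV record: {line!r}")
    priority, weight, port = (int(x) for x in rest[1:4])
    return SrvRecord(owner, priority, weight, port, rest[4])
```

With this helper, the record with and without `IN` parses to the same value.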

> These days, the internet seems to be moving to "well-known" HTTP URLs, because web apps can't use the DNS.

I'd prefer SRV records, because I don't want to depend on an HTTP library in my IRC clients.

emersion avatar Dec 10 '21 10:12 emersion

> You may want to clarify that the example record should be in the IN (Internet) namespace.

RFC 2782 doesn't use that syntax, nor does RFC 6186.

emersion avatar Dec 10 '21 10:12 emersion

I'm ok with including IN, but ok with omitting it as well, as the other two classes (Hesiod and chaosnet) are so incredibly unlikely. Hell, even Hesiod itself switched to IN later.

However, I'd like the spec to explicitly say which hostname (the original input or the SRV target) should be used for TLS host verification. In other words, does the certificate need to be for libera.chat (as with CNAMEs) or for irc.libera.chat (as with MX records)? That's something that took me a while to figure out when setting up Matrix, and it tends to vary between SRV-using protocols in general.

grawity avatar Dec 10 '21 10:12 grawity

Verifying the certificate against the SRV target would be dangerous; it allows an active MITM to alter the SRV reply to get you to validate against a domain name they control.

aaronmdjones avatar Dec 10 '21 11:12 aaronmdjones

> Verifying the certificate against the SRV target would be dangerous; it allows an active MITM to alter the SRV reply to get you to validate against a domain name they control.

Indeed, which is probably why Matrix chose to handle it like CNAME and use the original domain name.

But many other SRV consumers do use the target domain instead (for example, LDAPS), either for historical reasons or because they rely on DNSSEC.

In the end, I don't care which way you decide to do it, I care about whether it'll be documented in the spec.

grawity avatar Dec 10 '21 11:12 grawity

> Indeed, which is probably why Matrix chose to handle it like CNAME and use the original domain name.

For many users (in particular, people using ACME with the http-01 challenge), if the root domain and the IRC domain point to different hosts, it doesn't seem practical to get a certificate covering both domains onto the IRC host. So it seems in order to use this, you'd have to transfer one of the certificates between servers (and then expose both certificates and rely on SNI).

IMO the concerns about MITM are sufficiently serious that this should not be used as a means of automatically redirecting users to the correct server --- it should be at most a mechanism to suggest to the user that they reconfigure, e.g. a dialog box saying "there's no IRC server on libera.chat; did you mean irc.libera.chat?"

> I'd prefer SRV records, because I don't want to depend on an HTTP library in my IRC clients.

Just clarifying: AFAIK there's no way to look up a SRV record in JavaScript, so we're talking about desktop clients?

slingamn avatar Dec 10 '21 18:12 slingamn

For reference, the Matrix spec (https://spec.matrix.org/latest/server-server-api/#resolving-server-names):

> […] a server is found by resolving an SRV record for _matrix._tcp.<hostname>. This may result in a hostname (to be resolved using AAAA or A records) and port. Requests are made to the resolved IP address and port, using 8448 as a default port, with a Host header of <hostname>. The target server must present a valid certificate for <hostname>.

I agree we should require the certificate to be valid for the original hostname.

> For many users (in particular, people using ACME with the http-01 challenge), if the root domain and the IRC domain point to different hosts, it doesn't seem practical to get a certificate covering both domains onto the IRC host. So it seems in order to use this, you'd have to transfer one of the certificates between servers (and then expose both certificates and rely on SNI).

Let's go through the possible setups for network operators here:

  1. A single server for example.org and irc.example.org: just share the cert on the filesystem
  2. One server for example.org, multiple IRC servers behind irc.example.org sharing the load (e.g. Libera Chat): then http-01 can't be used anyways, since the server which will have to complete the challenge is random. I bet these networks are already using dns-01, can someone from Libera confirm?
  3. A single server for example.org and a single server for irc.example.org: these are the problematic setups.

How often does (3) happen in practice? It sounds like sharing the certs would be a minor annoyance in this specific case.

> IMO the concerns about MITM are sufficiently serious that this should not be used as a means of automatically redirecting users to the correct server --- it should be at most a mechanism to suggest to the user that they reconfigure, e.g. a dialog box saying "there's no IRC server on libera.chat; did you mean irc.libera.chat?"

This severely degrades the client's UX. I think with the TLS cert requirement the MITM concerns are resolved.

Another way to resolve them would be to add a subdomain requirement: the SRV target MUST be a subdomain of the original hostname. This wouldn't cover rarer cases like ubuntu.com pointing to irc.libera.chat, and it doesn't seem as strong as the TLS cert requirement.

> Just clarifying: AFAIK there's no way to look up a SRV record in JavaScript, so we're talking about desktop clients?

Yes, only clients connecting via TCP are taken into account here. JavaScript clients don't connect via TCP as noted above, and are typically configured by the network operator themselves, so wouldn't benefit from a discovery mechanism regardless.

emersion avatar Dec 11 '21 09:12 emersion
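The subdomain requirement floated above could be checked with a few lines of string handling. A minimal sketch, assuming that a target equal to the original hostname also counts as acceptable (the thread does not say either way); `is_subdomain` is a hypothetical helper name:

```python
def is_subdomain(target: str, original: str) -> bool:
    """Return True if `target` is `original` or a subdomain of it.

    Names are compared case-insensitively and a trailing root dot is ignored.
    Accepting target == original is an assumption of this sketch.
    """
    t = target.rstrip(".").lower()
    o = original.rstrip(".").lower()
    # Require a label boundary so "evillibera.chat" does not match "libera.chat".
    return t == o or t.endswith("." + o)
```

Under this rule, a target of irc.libera.chat is accepted for libera.chat but rejected for ubuntu.com, which is the limitation mentioned above.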

> One server for example.org, multiple IRC servers behind irc.example.org sharing the load (e.g. Libera Chat): then http-01 can't be used anyways, since the server which will have to complete the challenge is random. I bet these networks are already using dns-01, can someone from Libera confirm?

We are. Adding an extra hostname to our certificates requires no further effort on our part.

aaronmdjones avatar Dec 11 '21 12:12 aaronmdjones

> I think with the TLS cert requirement the MITM concerns are resolved.

ACK, this resolves my concerns.

slingamn avatar Dec 11 '21 23:12 slingamn
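The TLS cert requirement agreed on here amounts to connecting to the SRV target while validating the certificate against the name the user typed. A minimal sketch using Python's standard `ssl` module; `make_context` and `open_ircs` are hypothetical helper names, and the SRV lookup itself is assumed to have happened already:

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    # The default context verifies the certificate chain and the hostname.
    return ssl.create_default_context()


def open_ircs(original_host: str, target: str, port: int,
              timeout: float = 10.0) -> ssl.SSLSocket:
    """Connect to the SRV target, but validate the cert for original_host."""
    context = make_context()
    raw = socket.create_connection((target.rstrip("."), port), timeout=timeout)
    try:
        # server_hostname is sent as SNI and is the name the certificate
        # is checked against, so a forged SRV reply cannot substitute a
        # certificate for a domain the attacker controls.
        return context.wrap_socket(raw, server_hostname=original_host)
    except Exception:
        raw.close()
        raise
```

For example, `open_ircs("libera.chat", "irc.libera.chat.", 6697)` would open a TCP connection to irc.libera.chat and only succeed if the presented certificate is valid for libera.chat.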

Latest change looks good to me.

grawity avatar Dec 12 '21 10:12 grawity

Client implementation: https://git.sr.ht/~emersion/soju/commit/cb695edecb0aaf8b4d3b2d3a500ada78a5a67621

emersion avatar Dec 21 '21 14:12 emersion

Hmm out of curiosity, are the networks in question planning to use SRV purely as a redirect (single record pointing to the existing irc round-robin), or would they use SRV directly for load-balancing (multiple records pointing to individual servers)?

That is, should clients expect to be able to just take srv[0] like in soju, or should they at least build a flat list from all records, or should they go all the way and handle priorities/weights?

grawity avatar Dec 22 '21 07:12 grawity

The Go library already takes care of the shuffling, srv[0] is random from the list: https://cs.opensource.google/go/go/+/refs/tags/go1.17.5:src/net/dnsclient.go;drc=refs%2Ftags%2Fgo1.17.5;l=194

emersion avatar Dec 22 '21 08:12 emersion
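For clients whose resolver library returns SRV records unordered, the "go all the way" option means the selection procedure from RFC 2782: lowest priority first, weighted random order within a priority. A minimal sketch of that procedure; the `Srv` type and `order_srv` function are hypothetical names, not part of any client mentioned in this thread:

```python
import random
from typing import List, NamedTuple, Optional


class Srv(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def order_srv(records: List[Srv],
              rng: Optional[random.Random] = None) -> List[Srv]:
    """Return records in the order a client should try them."""
    rng = rng or random.Random()
    ordered: List[Srv] = []
    for priority in sorted({r.priority for r in records}):
        # Weight-0 records are placed first in the pool, as RFC 2782 asks,
        # which gives them only a small chance of being picked early.
        pool = sorted((r for r in records if r.priority == priority),
                      key=lambda r: r.weight != 0)
        while pool:
            total = sum(r.weight for r in pool)
            pick = rng.randint(0, total)
            running = 0
            for i, rec in enumerate(pool):
                running += rec.weight
                if running >= pick:
                    ordered.append(pool.pop(i))
                    break
    return ordered
```

A client that only wants one server can take the first element of the result; one that wants failover can walk the list in order.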

> IRC networks conforming to this specification MUST publish an SRV record with the "ircs" service label

The expected behaviour for a conforming network is clear and explicit. However, we do not cover under what circumstances a conforming IRC client should query for an SRV record, or whether it must always check for one. The linked client implementation in soju uses heuristics, such as whether the end user specified a port.

Should we tighten up the desired client behaviours in the spec?

kylef avatar Jan 01 '22 11:01 kylef

It seems appropriate to me to leave this implementation-defined. I had actually imagined this as a fallback that can be used when the initial connection attempt fails.

What would be gained by mandating a specific client behavior here?

slingamn avatar Jan 01 '22 23:01 slingamn

> What would be gained by mandating a specific client behavior here?

Clients can offer a consistent experience with regard to DNS SRV records. There isn't a suggestion, recommendation, or SHOULD clause at the moment.

For network or server operators deciding whether to add DNS SRV records, one consideration is how clients use them and the resulting user experience, which cannot be determined from this specification. Each client doing something different with the record may hamper the usability of the feature, or the desirability of implementing it.

kylef avatar Jan 02 '22 14:01 kylef

@kylef I take your point, but, given that there is no clear candidate for a recommendation, a SHOULD is toothless and a MUST incentivizes implementations that disagree with the MUST to ignore the specification altogether.

Maybe we should start collecting potential client behaviors so that either (a) one of them could be selected as a SHOULD, or (b) they could all be listed in a non-normative section?

slingamn avatar Jan 02 '22 19:01 slingamn

I guess I should state my own priorities: I care a lot about the handshake being fast, so I don't want to add a recommendation (even a SHOULD) that would mandate a SRV lookup (even with caching) in the case where the client is already "correctly" configured.

slingamn avatar Jan 02 '22 19:01 slingamn
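Two of the candidate client behaviors raised so far can be sketched side by side: (a) consult SRV only when the user gave no port, and (b) try the configured endpoint first and consult SRV only after that fails. This is a rough sketch, not a proposal for spec text; the function names are hypothetical, the DNS lookup and connection attempt are injected as callables, and 6697 is assumed as the default TLS port:

```python
from typing import Callable, List, Optional, Tuple

Endpoint = Tuple[str, int]
SrvLookup = Callable[[str], List[Endpoint]]   # query name -> (target, port) list
TryConnect = Callable[[Endpoint], bool]       # True if the connection succeeded

DEFAULT_TLS_PORT = 6697  # assumption of this sketch


def srv_name(hostname: str) -> str:
    return "_ircs._tcp." + hostname.rstrip(".")


def candidates_when_no_port(host: str, port: Optional[int],
                            lookup: SrvLookup) -> List[Endpoint]:
    """Behavior (a): an explicit port means the user knows best; skip SRV."""
    if port is not None:
        return [(host, port)]
    return lookup(srv_name(host)) or [(host, DEFAULT_TLS_PORT)]


def connect_with_fallback(host: str, port: Optional[int], lookup: SrvLookup,
                          try_connect: TryConnect) -> Optional[Endpoint]:
    """Behavior (b): no SRV lookup unless the direct attempt fails."""
    direct = (host, port or DEFAULT_TLS_PORT)
    if try_connect(direct):
        return direct
    for endpoint in lookup(srv_name(host)):
        if try_connect(endpoint):
            return endpoint
    return None
```

Behavior (b) keeps the handshake free of extra DNS queries for a correctly configured client, at the cost of waiting for the direct attempt to fail before SRV is consulted.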

FWIW we've added a basic SRV record for libera.chat, which is now also in our certificate SANs. I'd love to know if people are finding this useful. We might consider using the load balancing features of SRV instead of just pointing to the round robin if a significant number of clients intend to respect them.

$ dig +short srv _ircs._tcp.libera.chat
0 1 6697 irc.libera.chat.

It's also duplicated on irc.libera.chat, which I'm less sure about, but it'd be nice to get away from well-known ports altogether.

As for the cert debate, MTAs have been relying on DNSSEC for this forever. Can we do the same? (But I also really don't mind just using the original hostname for validation if that's what everyone wants)

edk0 avatar Jan 30 '22 10:01 edk0
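For anyone scripting a check against the published record, each line of `dig +short` SRV output carries priority, weight, port and target, in that order. A minimal sketch of turning that output into tuples; `parse_dig_short` is a hypothetical helper and lines that do not have exactly four fields are skipped:

```python
from typing import List, Tuple


def parse_dig_short(output: str) -> List[Tuple[int, int, int, str]]:
    """Parse `dig +short srv ...` output into (priority, weight, port, target)."""
    records = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # not an SRV answer line
        priority, weight, port = (int(x) for x in parts[:3])
        # Drop the trailing root dot from the target.
        records.append((priority, weight, port, parts[3].rstrip(".")))
    return records
```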