
doh: use tls session cache

Open ignoramous opened this issue 1 year ago • 1 comments

In our experiments with rethinkdns, employing a session cache reduces the data consumed by DoH by nearly 3x (500 MB/mo down to 180 MB/mo) and latency by up to 4x.

ignoramous avatar Oct 16 '24 20:10 ignoramous

cc: @fortuna unsure if this is better or worse from an anti-censorship pov. I imagine it shouldn't affect it at all (given session resumption has no effect on the SNI extension).

ignoramous avatar Oct 18 '24 06:10 ignoramous

Session resumption has an effect on privacy, as it allows the server to track the same user over time. A session ticket is essentially a cookie.

link2xt avatar Nov 06 '24 02:11 link2xt

In this case the session is gone after the session is terminated, so it's not long-lived, similar to a browser.

I believe the DNS client also issues multiple queries on the same connection, which already lets you correlate queries. But IETF RFCs generally recommend connection reuse:

  • DNS over TCP: https://datatracker.ietf.org/doc/html/rfc7766#section-6.2.1
  • DNS over TLS: https://datatracker.ietf.org/doc/html/rfc7858#section-3.4

fortuna avatar Nov 06 '24 17:11 fortuna

It seems like we need to better understand why performance improves. Perhaps connection reuse is not working?

fortuna avatar Nov 06 '24 17:11 fortuna

This article provides some helpful privacy context, with a link to related research: https://venafi.com/blog/tls-session-resumption/

Perhaps we should clear the cache every minute or so. It's not super clear how much more privacy one would get though, especially if we can properly rely on connection reuse.
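One way the periodic clearing could look is a wrapper around Go's LRU session cache that throws away all tickets once a maximum age has elapsed. This is only a sketch under assumptions: `expiringSessionCache`, `newExpiringSessionCache`, `demoExpiry`, the one-minute age, and the injectable clock are all hypothetical and not part of Intra.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"sync"
	"time"
)

// expiringSessionCache wraps an LRU session cache and discards every stored
// ticket once maxAge has passed since the last reset.
type expiringSessionCache struct {
	mu        sync.Mutex
	inner     tls.ClientSessionCache
	size      int
	maxAge    time.Duration
	lastReset time.Time
	now       func() time.Time // injectable clock, used by the demo below
}

// Compile-time check that the wrapper satisfies the crypto/tls interface.
var _ tls.ClientSessionCache = (*expiringSessionCache)(nil)

func newExpiringSessionCache(size int, maxAge time.Duration, now func() time.Time) *expiringSessionCache {
	return &expiringSessionCache{
		inner:     tls.NewLRUClientSessionCache(size),
		size:      size,
		maxAge:    maxAge,
		lastReset: now(),
		now:       now,
	}
}

// resetIfStale replaces the inner cache when it is too old. Caller holds mu.
func (c *expiringSessionCache) resetIfStale() {
	if c.now().Sub(c.lastReset) >= c.maxAge {
		c.inner = tls.NewLRUClientSessionCache(c.size)
		c.lastReset = c.now()
	}
}

func (c *expiringSessionCache) Get(key string) (*tls.ClientSessionState, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.resetIfStale()
	return c.inner.Get(key)
}

func (c *expiringSessionCache) Put(key string, cs *tls.ClientSessionState) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.resetIfStale()
	c.inner.Put(key, cs)
}

// demoExpiry stores a placeholder entry and reports whether it is visible
// before and after the fake clock advances past maxAge.
func demoExpiry() (before, after bool) {
	clock := time.Now()
	cache := newExpiringSessionCache(16, time.Minute, func() time.Time { return clock })
	cache.Put("resolver.example", &tls.ClientSessionState{})
	_, before = cache.Get("resolver.example")
	clock = clock.Add(2 * time.Minute)
	_, after = cache.Get("resolver.example")
	return before, after
}

func main() {
	before, after := demoExpiry()
	fmt.Println("entry present before expiry:", before)
	fmt.Println("entry present after expiry:", after)
}
```

An instance of this type could be assigned to `tls.Config.ClientSessionCache`. Clearing only bounds how long a ticket can link two connections; it does nothing about correlation within a single reused connection.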

fortuna avatar Nov 06 '24 17:11 fortuna

> Perhaps we should clear the cache every minute or so.

It's typical for some public DoH / DoT resolvers to allow session resumption for multiple days.

openssl s_client -connect one.one.one.one:853 -reconnect shows 2 days, for example.
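A rough Go analogue of the `-reconnect` check, for seeing whether a client actually resumes sessions, is to open a fresh connection per request and inspect `ConnectionState.DidResume`. The sketch below runs against a local `httptest` server rather than a real resolver so it works offline; `resumedFlags` is a hypothetical helper name. It does not report the ticket lifetime, only whether resumption happened.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// resumedFlags performs n sequential requests, each on a fresh TCP+TLS
// connection, and reports for each whether the handshake resumed a session.
func resumedFlags(n int, useCache bool) ([]bool, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()

	client := srv.Client() // already trusts the test server's certificate
	tr := client.Transport.(*http.Transport)
	tr.DisableKeepAlives = true // force a new connection for every request
	if useCache {
		tr.TLSClientConfig.ClientSessionCache = tls.NewLRUClientSessionCache(8)
	}

	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return nil, err
		}
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		flags = append(flags, resp.TLS.DidResume)
	}
	return flags, nil
}

func main() {
	for _, useCache := range []bool{false, true} {
		flags, err := resumedFlags(2, useCache)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("session cache=%v resumed=%v\n", useCache, flags)
	}
}
```

Keep-alives are disabled here only to isolate resumption from connection reuse; a real DoH client would normally leave them on.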

Unsure what Private DNS does (believe it was implemented by Benjamin?), but it'd be interesting to look.

> if we can properly rely on the connection reuse

From what I've observed, the Go stdlib does reuse connections for HTTP, though on phones especially, one'd want to steer clear of longer keepalives.
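To check the reuse claim rather than assume it, `net/http/httptrace` can report whether each request got an already-open connection. The following is a sketch against a local test server; `reusedFlags` and `shortIdleTimeout` (30 seconds, an arbitrary stand-in for a phone-friendly keepalive) are hypothetical names and values.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// shortIdleTimeout is an illustrative value, not a recommendation.
const shortIdleTimeout = 30 * time.Second

// reusedFlags sends n sequential requests through one client and records,
// for each, whether the transport handed out an already-open connection.
func reusedFlags(n int, idleTimeout time.Duration) ([]bool, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()

	client := srv.Client()
	// Idle connections are closed after this long without traffic.
	client.Transport.(*http.Transport).IdleConnTimeout = idleTimeout

	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		var reused bool
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body so the connection returns to the idle pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		flags = append(flags, reused)
	}
	return flags, nil
}

func main() {
	flags, err := reusedFlags(3, shortIdleTimeout)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("connection reused per request:", flags)
}
```

If a real client showed `Reused` as false for most queries, that would point to connection reuse not working, and would help explain why a session cache makes such a large difference.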

ignoramous avatar Nov 07 '24 02:11 ignoramous

After trying to get any latency reduction from TLS 1.3 session resumption in https://github.com/deltachat/deltachat-core-rust/pull/6182 and looking at the diagram in https://www.rfc-editor.org/rfc/rfc8446#section-2.2, I don't understand where a 4x latency reduction can come from. It could be that session establishment is slow on the server side, so clients requesting a new session see some delay, but in terms of RTT you only gain 1 RTT, and only if you send your request in early_data. Otherwise, with a normal TLS 1.3 handshake you can send the request in response to the Server Hello, which is as good as it can be without 0-RTT.
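The round-trip argument can be written out as simple arithmetic. This is a sketch that counts only network round trips on a new connection, assuming no TCP Fast Open and ignoring server processing time and packet loss; `roundTripsToFirstResponse` is a hypothetical helper.

```go
package main

import "fmt"

// roundTripsToFirstResponse counts round trips from opening a new TCP
// connection until the first HTTP response arrives.
func roundTripsToFirstResponse(tls13, resumed, earlyData bool) int {
	const tcpHandshake = 1
	const requestResponse = 1

	tlsHandshake := 2 // TLS 1.2 full handshake
	if tls13 || resumed {
		// TLS 1.3 (full or resumed) and TLS 1.2 abbreviated handshakes
		// each take one round trip.
		tlsHandshake = 1
	}

	total := tcpHandshake + tlsHandshake + requestResponse
	if tls13 && resumed && earlyData {
		// With 0-RTT the request travels with the ClientHello, so the
		// TLS handshake adds no round trip of its own.
		total--
	}
	return total
}

func main() {
	fmt.Println("TLS 1.2 full:      ", roundTripsToFirstResponse(false, false, false)) // 4
	fmt.Println("TLS 1.2 resumed:   ", roundTripsToFirstResponse(false, true, false))  // 3
	fmt.Println("TLS 1.3 full:      ", roundTripsToFirstResponse(true, false, false))  // 3
	fmt.Println("TLS 1.3 resumed:   ", roundTripsToFirstResponse(true, true, false))   // 3
	fmt.Println("TLS 1.3 early data:", roundTripsToFirstResponse(true, true, true))    // 2
}
```

Under these assumptions, round trips alone account for at most a 4-to-3 improvement with TLS 1.2 resumption and none with TLS 1.3 unless early data is used, so a 4x gain would have to come from somewhere else, such as server-side handshake cost or bytes on the wire.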

Bandwidth reduction makes sense, especially if server certificates are not compressed.

link2xt avatar Nov 10 '24 08:11 link2xt

> I don't understand where 4x latency reduction can come from

Which resolvers are you testing with? We observed up to a 4x reduction for Rethink's upstreams, which are neither as extensively deployed nor run on machines as powerful (1/16th of a vCPU per VM, with 40 such VMs across 20+ regions) as some of the other public DoH/DoT resolvers may be. For Rethink DNS, my theory is that the efficiency comes from a combination of things, including 1-RTT (resumption) and adaptive TLS record sizing (set to 1280 minus the TLS+TCP+IP header overheads).

Without session resumption (+ adaptive resizing), it was very common to see 3KB/4KB requests (using the DoH client from our fork of Intra) per DNS query (sans conn reuse).

> 1 RTT if you send your request in early_data

I believe TLS v1.3 early data (which is 0-RTT, but not as efficient if TCP is Nagle'd) is not implemented or is disabled by most HTTP servers and reverse proxies (due to request idempotency concerns?).

ignoramous avatar Nov 10 '24 09:11 ignoramous