disable QUIC Retries when not under load
QUIC defines an address validation mechanism: https://tools.ietf.org/html/draft-ietf-quic-transport-25#section-8.1.2. A server that receives an Initial packet from a new client can send a Retry packet with an opaque token, and the client then includes this token on subsequent packets.
This costs one network round trip, so we don't really want to use this on every connection. This behavior can be configured in quic-go using the quic.Config's AcceptToken callback. Currently, Caddy doesn't set any custom callback, which means that quic-go will send a Retry for every client that hasn't connected to the server before.
Can we find a better logic for this?
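For illustration, here is a rough sketch of the kind of callback this issue asks for, assuming the quic.Config.AcceptToken signature from that generation of quic-go (a function taking the client address and a *quic.Token, where returning false for a token-less client triggers a Retry). The underLoad helper and the token-field checks are assumptions for the sketch, not Caddy or quic-go code:

```go
// Rough sketch only, not Caddy's implementation. Assumes quic-go's
// Config.AcceptToken callback as it existed around draft-25.
package example

import (
	"net"
	"time"

	"github.com/lucas-clemente/quic-go" // import path of quic-go at the time
)

// underLoad is a hypothetical placeholder for whatever load metric Caddy adopts.
func underLoad() bool { return false }

func acceptToken(clientAddr net.Addr, token *quic.Token) bool {
	if token == nil {
		// Client has no token yet: only force the Retry round trip when
		// the server is under load; otherwise let the handshake proceed.
		return !underLoad()
	}
	// Client presented a token: accept it if it's fresh and was issued for
	// the same remote address. (Field names assumed from the quic-go Token
	// struct of that era; a real implementation would be more careful here.)
	if time.Since(token.SentTime) > 10*time.Second {
		return false
	}
	return token.RemoteAddr == clientAddr.String()
}

func quicConfig() *quic.Config {
	return &quic.Config{AcceptToken: acceptToken}
}
```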
Sure, let's see -- I'm not super familiar with this concept yet, so I'll have to do some reading.
It looks like tokens in this context are used for "address validation" -- to ensure the client can actually receive packets at the address it claims to be sending from, which helps mitigate amplification attacks. Makes sense. Are there security benefits to just always requiring a token?
From what I understand so far, though, ideally we only require a token when there is heavy load on the server, which could indicate a potential amplification attack.
If I'm right so far, then yes, I'd definitely like to implement this callback. What do you think is a good ballpark estimate of "under load"? What kind of metric should I use?
> It looks like tokens in this context are used for "address validation" -- to ensure the client can actually receive packets at the address it claims to be sending from, which helps mitigate amplification attacks. Makes sense. Are there security benefits to just always requiring a token?
Your understanding is correct. There's no security benefit to always requiring a token (and it adds a one-round-trip penalty). There are, however, two DoS-mitigation benefits to doing so:
- By requiring the client to prove that it didn't fake the sender IP address, we can defer the expensive crypto operations needed for the TLS handshake.
- It will be harder to use the server for a reflection attack (although QUIC comes with a built-in limit for that already: a server won't send more than 3x the bytes it has received from an unvalidated client).
> From what I understand so far, though, ideally we only require a token when there is heavy load on the server, which could indicate a potential amplification attack.
An amplification attack, or a DoS attack against the server itself.
> If I'm right so far, then yes, I'd definitely like to implement this callback. What do you think is a good ballpark estimate of "under load"? What kind of metric should I use?
I was hoping that Caddy already has some kind of metric that we could reuse here. One metric could be CPU usage; another could be the number of accepted connections within the last x seconds. Not really sure what makes sense here.
Thanks for the information! I think I understand then.
I haven't implemented metrics into Caddy 2 yet. But when I do, we can definitely do this! Thanks for requesting it.
@hairyhenderson Would your metrics PR (#3709) be able to help gauge when a server is "under load"?
@mholt possibly!
In general, the process_cpu_seconds_total metric could be used (on Linux and Windows only unfortunately) to track CPU usage, while caddy_http_request_duration_seconds_count could be used to track accepted connections, or caddy_http_requests_in_flight to track currently-in-flight connections.
In the context of this issue though, some more work may need to be done. The metrics exposed in #3709 are primarily intended as a method of externally monitoring the service, and while there is a way to get the in-memory metrics from the Prometheus client's registry, it can be very expensive. See the testutil.ToFloat64 docs for more discussion on that front.
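For context on the testutil.ToFloat64 approach mentioned above, a tiny sketch of what reading an in-memory value through it looks like; the gauge here is an unlabeled stand-in, not the actual (labeled) Caddy metric, and the helper is really aimed at tests rather than a per-connection hot path:

```go
// Sketch: reading an in-memory metric value via the Prometheus client's
// testutil helper. It collects and decodes the metric on every call, which
// is part of why it's expensive, and it only works for a single series.
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"
)

func main() {
	// requestsInFlight is a stand-in for a gauge like caddy_http_requests_in_flight.
	requestsInFlight := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "requests_in_flight",
		Help: "Stand-in gauge for demonstration.",
	})
	requestsInFlight.Add(3)

	fmt.Println(testutil.ToFloat64(requestsInFlight)) // prints 3
}
```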
The metrics PR may not be totally orthogonal here though, as it does give us some hooks to track some specific request-rate metrics for this particular purpose.
As far as tracking CPU usage in-process, that may be more difficult. We could replicate what the Prometheus client does, but that may be quite expensive as well, since it involves reading from procfs.
Ultimately I think keeping a simple metric of "how many requests did I start handling over the past N seconds?" in-memory might be the best indicator of load. Obviously it would need to be tunable for different systems.
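For what it's worth, a minimal sketch of that kind of in-memory counter (a per-second ring buffer covering the past N seconds); all names here are illustrative, not from Caddy or #3709:

```go
package example

import (
	"sync"
	"time"
)

type loadTracker struct {
	mu      sync.Mutex
	buckets []int     // one counter per second, used as a ring buffer
	last    time.Time // time the ring was last advanced to
	idx     int       // index of the current bucket
}

func newLoadTracker(windowSeconds int) *loadTracker {
	return &loadTracker{buckets: make([]int, windowSeconds), last: time.Now()}
}

// Record notes that a request (or connection attempt) just started.
func (t *loadTracker) Record() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.advance()
	t.buckets[t.idx]++
}

// Rate reports how many requests started within the window.
func (t *loadTracker) Rate() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.advance()
	total := 0
	for _, n := range t.buckets {
		total += n
	}
	return total
}

// advance rotates the ring buffer past any whole seconds that have elapsed,
// zeroing the buckets that fall out of the window.
func (t *loadTracker) advance() {
	elapsed := int(time.Since(t.last).Seconds())
	if elapsed <= 0 {
		return
	}
	t.last = t.last.Add(time.Duration(elapsed) * time.Second)
	if elapsed > len(t.buckets) {
		elapsed = len(t.buckets)
	}
	for i := 0; i < elapsed; i++ {
		t.idx = (t.idx + 1) % len(t.buckets)
		t.buckets[t.idx] = 0
	}
}
```

An AcceptToken-style callback could then treat the server as "under load" whenever Rate() crosses a configurable threshold.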
Implemented in #4707, will be merged and released soon.