Endpoint stats interface
At Anza, we need more insight into the resources quinn uses when servicing QUIC handshakes. This PR adds an initial interface to the Endpoint struct for getting more detailed stats than just the number of open connections, along with some initial data relevant to handshakes. On a side note to the Quinn devs: if there is a good way to support getting the current number of open handshakes, we would be interested in knowing it. Currently, it seems that the Endpoint is not signaled when a Connecting future finishes, and adding such an event would require some refactoring, not to mention handling an extra event, locking, etc.
@Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests. We merely accept all incoming connections and then immediately drop some Connecting future based on our own logic.
> @Ralith @djc one thing that we're particularly interested in is related to the resource usage from crypto, especially in accept() because we don't currently reject any incoming connection requests.
Why not? I think we had some Solana people look at that as an option for this exact reason.
Mostly because we currently only do some basic rate-limiting and nothing more. It's probably a good idea to look at using reject when over the rate limit instead of accepting and immediately dropping :sweat: but the thing is that we currently don't really have a great idea of what the limit should be, since we don't have insight into the resource usage from handshakes, especially crypto (but we think it's significant).
Would you guys have any ideas of how we might be able to get some better numbers on e.g. the mem usage from crypto in accept() before the future is returned?
Can you elaborate on "mem usage from crypto"? I don't expect cryptographic operations to consume significant memory, or really anything beyond O(input). Usually the concern with handshakes is CPU usage, which could in principle be measured with some sort of high-resolution timer around `accept` and, most importantly, the `poll` calls on the internal connection driver future.
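To make the timer idea a bit more concrete, here is a minimal std-only sketch of a future wrapper that accumulates the time spent inside `poll`. Everything in it (`TimedPoll`, `YieldOnce`, `block_on`) is a hypothetical name for illustration and not part of quinn, and it measures wall-clock time per poll rather than true CPU time. Where such a wrapper would actually sit relative to quinn's internal driver future is an open question this sketch does not answer.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Hypothetical wrapper (not a quinn type): records how long each
/// `poll` of the inner future takes and how many polls were needed.
struct TimedPoll<F> {
    inner: Pin<Box<F>>,
    busy: Duration,
    polls: u32,
}

impl<F: Future> TimedPoll<F> {
    fn new(inner: F) -> Self {
        Self { inner: Box::pin(inner), busy: Duration::ZERO, polls: 0 }
    }
}

impl<F: Future> Future for TimedPoll<F> {
    /// Inner output, total time spent in `poll`, number of polls.
    type Output = (F::Output, Duration, u32);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let started = Instant::now();
        let result = self.inner.as_mut().poll(cx);
        self.busy += started.elapsed();
        self.polls += 1;
        match result {
            Poll::Ready(out) => Poll::Ready((out, self.busy, self.polls)),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// Test future that returns `Pending` exactly once before completing.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for the busy-polling executor below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the example runs without a runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    // Stand-in for handshake work: one yield, then some computation.
    let work = async {
        YieldOnce(false).await;
        (0..1_000u64).sum::<u64>()
    };
    let (value, busy, polls) = block_on(TimedPoll::new(work));
    println!("value={value} polls={polls} busy={busy:?}");
}
```

Summing `busy` across many handshakes would give a rough per-handshake cost, which could then inform where to set a rate limit.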
I think CPU usage is something to look at as well. We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept(), making it a prime target for rate limiting given some data. That data would also allow us to determine, once and for all, whether crypto does indeed take a significant amount of memory under attack (e.g. getting spammed with lots of handshake attempts, or some variation of this), which we suspect based on a recent security advisory but don't know for sure.
In general I think a quinn_proto::Connection takes a decent amount of memory, but only a small part of that is due to the crypto involved. To the extent that you can, it's probably better to reject connections based on the Incoming data.
> We're looking specifically at x509 verification, as it happens before we get the future returned to us when we accept(), making it a prime target for rate limiting given some data.
This is no longer the case as of 0.11. No CPU-intensive work is performed until you accept a connection.
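Since the expensive work only starts on accept, the accept-or-reject decision can be made cheaply up front. A minimal sketch of that decision as a token bucket, assuming a single global limit: `TokenBucket` and `Decision` are hypothetical names, and the capacity and refill rate are placeholders rather than recommendations. In quinn 0.11 the `Refuse` branch would correspond to calling `Incoming::refuse` instead of `Incoming::accept`; that wiring is left out so the block stays self-contained.

```rust
use std::time::{Duration, Instant};

/// Outcome of the admission check for one incoming handshake.
#[derive(Debug, PartialEq)]
enum Decision {
    Accept,
    Refuse,
}

/// Hypothetical handshake admission limiter (not part of quinn).
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// Starts full, so a burst of up to `capacity` handshakes is admitted.
    fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last: now,
        }
    }

    /// Refills based on elapsed time, then spends one token if available.
    fn decide(&mut self, now: Instant) -> Decision {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Decision::Accept
        } else {
            Decision::Refuse
        }
    }
}

fn main() {
    let start = Instant::now();
    // Placeholder limits: burst of 2, refilling 1 handshake per second.
    let mut bucket = TokenBucket::new(2, 1.0, start);
    for offset_ms in [0u64, 0, 0, 1_000] {
        let now = start + Duration::from_millis(offset_ms);
        println!("t+{offset_ms}ms: {:?}", bucket.decide(now));
    }
}
```

Passing `now` in explicitly keeps the limiter deterministic under test; a per-source-address variant would keep one bucket per address, at the cost of extra state.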
CC @t-nelson for input. From the above, it's a little difficult to directly measure memory usage from x509 verification and apparently there are decent ways to rate-limit this in newer versions anyway. I also understand that it was requested to track various state transitions in the handshake, but this doesn't look super feasible. In any case, perhaps we should just get the interface in and look at adding more fields later as the need arises?
Hmm, I'm unable to merge. Perhaps it's because "@Ralith requested changes"?
You won't be able to merge either way; only we can. But I'll wait for @Ralith's approval.