wfe: add per-endpoint inflight limit
We often have very short-lived load spikes at 00:00 UTC. When these happen, performance degrades. One of the bottlenecks is database capacity. During a spike, we often spin up a lot of new database connections, which takes wall-clock time and also consumes resources on the database server. If we could spread out the load, even on the order of a few seconds, we could get better reuse out of existing connections and reduce the maximum number of queries queued in the database for execution.
One way to do this is to apply a limit in the WFE on the number of inflight requests for any given endpoint. When the limit is exceeded, new requests to that endpoint block until a slot is available or the request's context hits a deadline. There are various ways to implement this pattern; a simple one is a channel with capacity equal to the limit. Another would be a mutex-guarded counter: type limiter struct { sync.Mutex; count int; limit int; ready chan struct{} }.
Why per-endpoint? Some endpoints require more work than others, and some endpoints actually remove work. For instance, a successful response to an authz polling request means we get no more requests for that authz, while a successful request to new-order means additional requests soon to follow. So we prefer to put tighter limits on certain endpoints, like new-order.
This approach is not perfect, since the limit state is tracked per-WFE rather than globally. However, it's quick to implement and doesn't require external services, and load is probably distributed approximately evenly across WFEs.
It would be particularly nice to be able to change this value without restarting the WFEs, since the WFEs take some time to shut down cleanly. One possibility: We can use the reloader functionality we already use for rate limits, ECDSA allow list, and hostname policy. Then shipping a new file would change the limits without requiring a restart. Alternatively, we could pull periodically from some service, like Consul, MariaDB, or Redis.