envoy icon indicating copy to clipboard operation
envoy copied to clipboard

QUIC hot restart part 6 - child instance pauses listening until parent is drained

Open ravenblackx opened this issue 1 year ago • 7 comments

Commit Message: QUIC hot restart part 6 - child instance pauses listening until parent is drained Additional Description: Make the child instance not read from the UDP socket until the parent instance stops reading. From the prior changes, the parent instance forwards UDP packets it doesn't recognize to the child instance via a side-channel, so not reading doesn't mean not responding - it just means not intercepting packets which may be intended for the parent instance.

In this change there is no drain optimization - if the parent has no more live connections we will still wait and forward packets until the drain time completes. I think it is most likely not worth optimizing for that, since the eventual goal is to migrate the QUIC-packet-sorting behavior onto eBPF at which point, for systems that care about performance, all of this manual forwarding support will be unused anyway.

I've tried to keep entirely of the performance hot path, so the change only touches constructors and enable/disable operations and stays entirely out of the read path.

It's awkward because the point at which we detect that the socket is being forwarded (the creation of the UdpListenSocket) the UdpListener and ActiveQuicListener don't exist, but those are where we need to disable or enable reading. Also awkward is the potential race - the unpause callback can potentially arrive after the listener was deleted, and must go to the right thread, so there's a bit of a tangle ensuring that that's the case - for this we still call the callback, but abort if the listener was deleted by the time the dispatcher tries to perform the action.

Finally, there's a potential race of disable/enable and this paused state, so I've added enough state to the UdpListener that the sequence start_paused->disable->unpause does not accidentally end with the listener enabled.

Risk Level: Some risk, this is touching quite a bit of core code. The touches should mostly not do anything except to QUIC listeners, so the risk isn't as big as it looks. Testing: Unit tests provide coverage; I intend to augment hotrestart_handoff test to make sure this is fully doing what it needs to do for QUIC listeners. Docs Changes: Release Notes: Platform Specific Features:

ravenblackx avatar Nov 30 '23 18:11 ravenblackx

As a reminder, PRs marked as draft will not be automatically assigned reviewers, or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/31130 was opened by ravenblackx.

see: more, trace.

I think this is a reasonable approach overall. But I think you'll need a way to un-register for these callbacks, in the case that the quic listener is removed via LDS after the child has started, but before the parent has finished draining.

That's handled as-is by the callback self-aborting, rather than by unregistering. I don't think unregistering can work reliably because the destructor may not be called on the thread that owns the registry (it could work but would require a network of locks and the behavior would become much less clear if destruction and unpausing are trying to be concurrent - having the callback abort on-thread if the destructor has happened means the "race" occurs exclusively in the worker thread, so there can be no attempted concurrency between destruction and notification).

ravenblackx avatar Dec 13 '23 22:12 ravenblackx

I think this is a reasonable approach overall. But I think you'll need a way to un-register for these callbacks, in the case that the quic listener is removed via LDS after the child has started, but before the parent has finished draining.

I'd also like to get at least one more opinion on this, maybe oneof @wbpcode @soulxu @alyssawilk @yanavlasov.

Or we only register the thread worker as the callbacks, then we can use the existing disableListeners/enableListeners method (maybe add a new flag to disable all the udp listener) of ConnectionHandler to pause/unpause the listener. https://github.com/envoyproxy/envoy/blob/d797293ee8f31328ae664304ad6d6ef85d2e4f2c/source/common/listener_manager/connection_handler_impl.h#L56-L57

soulxu avatar Dec 14 '23 09:12 soulxu

I think this is a reasonable approach overall. But I think you'll need a way to un-register for these callbacks, in the case that the quic listener is removed via LDS after the child has started, but before the parent has finished draining. I'd also like to get at least one more opinion on this, maybe oneof @wbpcode @soulxu @alyssawilk @yanavlasov.

Or we only register the thread worker as the callbacks, then we can use the existing disableListeners/enableListeners method (maybe add a new flag to disable all the udp listener) of ConnectionHandler to pause/unpause the listener.

The problem with using the existing functions is there are already calls to those functions which could potentially make something racily do the wrong thing, e.g. if a listener starts up and is disabled by a call to disableListeners for hot restart purposes, then is disabled for some other reason, then hot restart draining completes and calls enableListeners, now the socket would be listening despite the other caller of disableListeners expecting it not to be. Tracking it separately ensures that the socket is only reading if both things want it to be.

So my rationale for separating out this paused state is that it can play well with other enable/disable operations with no risk of conflict. Another way to achieve this would be to add a "disabledness" counter so that two disables must be matched to two enables, but knowing whether that would be safe or not would require digging a lot more into existing code to ensure it doesn't currently expect one enable to be capable of overriding two prior disables.

ravenblackx avatar Dec 26 '23 15:12 ravenblackx

/coverage

ravenblackx avatar Jan 10 '24 22:01 ravenblackx

Coverage for this Pull Request will be rendered here:

https://storage.googleapis.com/envoy-pr/31130/coverage/index.html

The coverage results are (re-)rendered each time the CI envoy-presubmit (check linux_x64 coverage) job completes.

:cat:

Caused by: a https://github.com/envoyproxy/envoy/pull/31130#issuecomment-1885867427 was created by @ravenblackx.

see: more, trace.

/retest

ravenblackx avatar Jan 16 '24 17:01 ravenblackx

/retest

ravenblackx avatar Feb 29 '24 17:02 ravenblackx