`Context` semantics for cross-actor-task cancellation and overruns
Finally!
This is a rigorous rework of our Context and ContextCancelled
semantics to be more like trio.CancelScope
for cross-actor linked tasks!
High level mechanics:
- if any actor-task requests a remote actor to be cancelled, either by
calling
Actor.cancel()or the internalActor._cancel_stask()(which is called byContext.cancel()), that requesting actor's uid is set on the target far-endContext._cancel_called_remoteand any eventually raisedContextCancelled.canceller: tuple. - this allows any actor receiving a
ContextCancellederror-msg to check if they were the cancellation "requester", in which case the error is not raised locally and instead the cancellation is "caught" (and silently ignored inside thePortal.open_context()block, and possibly@tractor.contextcallee func) just like normaltriocancel scopes: if you callcs.cancel()inside some scope or nursery block. - if any received
ContextCancelled.canceller != current_actor().uid, then the error is raised thus making it explicit that the cancellation was not anticipated by the cancel-receiving (actor) task and thus can be subsequently handled as an (unanticipated) remote cancellation.
Also included is new support and public API for allowing contexts (and their streams) to "allow overruns" (since backpressure isn't really possible without a msg/transport protocol extension): enabling the case where some sender is pushing msgs faster then the other side is receiving them and no error is received on the sender side.
Normally (and by default) the rx side will receive (via
RemoteActorError msg) and subsequently raise a StreamOverrun that's
been relayed from the sender. The receiver then can decide how to the
handle the condition; previously any sent msg delivered that caused an
overrun would be discarded.
Instead we now offer a allow_overruns: bool to the
Context.open_stream() API which adds a new mode with the following
functionality:
- if an overrun condition is hit, we spawn a single "overflow drainer"
task in the
Context._scope_nursery: trio.Nurserywhich runsContext._drain_overflows()(not convinced of naming yet btw :joy:): - this task has one simple job: it does the work of calling
await self._send_chan(msg)for every queued up overrun-condition msg (stored in the newContext._overflow_q: deque) in a blocking fashion but without blocking the RPC msg loop. - when the
._in_overrun: boolis already set we presume the task is already running and instead new overflow msgs are queued and a warning log msg emitted. - on
Contextexit any existing drainer task is cancelled and lingering msg discarded but reported in a warning log msg.
Details of implementation:
- move all
Contextrelated code to a new._contextmod. - extend the test suite to include full coverage of all cancel request and overruns (ignoring) cases.
- add
ContextCancelled.canceller: tuple[str, str]. - expose new
Context.cancel_called,.cancelled_caughtand.cancel_called_remoteproperties. - add
Context._deliver_msg()as the factored content from what wasActor._push_result()which now calls the former. - add a new
mk_context() -> Contextfactory.
Possibly still todo:
- [ ] is maybe
no_overrunsorignore_overrunsor something else a better var name? - [ ] better
StreamOverrunmsg content? - [ ] make
Context.result()returnNoneif the default is never set, more like a function that has no return? - [ ] drop lingering
._debug.breakpoint()comments from._context.py? - [ ] ensure we don't need the
cs.shield = Truethat was insideContext.cancel()? - [ ] get more confident about dropping the previous
Context._send_chan.aclose()calls (currently commented out)? - [ ] not use
@dataclassforContextand instead gomsgspec.Struct? - [ ] KINDA IMPORTANT maybe we should allow relaying the source
error traceback to cancelled context tasks which were not the cause of
their own cancellation?
- i.e. the
ContextCancelled.canceller != current_actor().uidcase where it might be handy to know why, for eg., some parent actor cancelled some other child that you were attached to, instead of just seeing that it was cancelled by someone else?
- i.e. the
- [ ] apparently on self-cancels (which i still don't think is fully tested..?) we can get a relayed error that has no
.cancellerset?- see this example from the new remote annotation testing in
piker- which when you inspect the input
errtoContext._maybe_raise_remote_err()
- which when you inspect the input
- see this example from the new remote annotation testing in