trio
trio copied to clipboard
Add trio.from_thread.check_cancelled api to allow threads to efficiently poll for cancellation
This PR reuses the thread local object in _threads.py
to stash the raise_cancel
provided to abort, which basically implements a suggestion by @njsmith in https://github.com/python-trio/trio/issues/961#issuecomment-501597755:
Another idea: in
run_sync_in_thread
, start the thread, then block inwait_task_rescheduled
. In our blocked state, if our >abort_fn
is called, stash theraise_fn
somewhere where the thread can see it. Thentrio.from_thread.check_cancel()
is just >something likeif THREAD_STATE.raise_fn is not None: raise_fn()
.
This might be simpler, but I bet it will also have some complications by the time we figure out how to forward arbitrary >cancellations to the child thread, and also let the thread re-enter trio and deliver cancellations to those tasks.
I would like to use this to allow a sqlite query to poll for cancellation. Currently, I would have to dig into the internals and use this recipe from @oremanj:
def make_cancel_poller():
_cancel_status = trio.lowlevel.current_task()._cancel_status
def poll_for_cancel(*args, **kwargs):
if _cancel_status.effectively_cancelled:
raise trio.Cancelled._create()
return poll_for_cancel
Furthermore, that method is not thread safe if shielding is being toggled:
https://github.com/python-trio/trio/blob/2d62ff037f73ffd129485809ac0f14001e1eab6d/trio/_core/_run.py#L223-L231
Codecov Report
Merging #2392 (ab092b0) into master (b161fec) will increase coverage by
0.00%
. The diff coverage is100.00%
.
Additional details and impacted files
@@ Coverage Diff @@
## master #2392 +/- ##
========================================
Coverage 99.13% 99.14%
========================================
Files 115 115
Lines 17242 17432 +190
Branches 3085 3107 +22
========================================
+ Hits 17093 17283 +190
Misses 104 104
Partials 45 45
Files | Coverage Δ | |
---|---|---|
trio/_tests/test_threads.py | 100.00% <100.00%> (ø) |
|
trio/_threads.py | 100.00% <100.00%> (ø) |
|
trio/from_thread.py | 100.00% <100.00%> (ø) |
I'd like to see threads be able to run code directly in the nursery that started it, this should fix this issue more robustly.
I'll edit this comment to link to what I mean shortly. Remind me if I don't
Is this similar to what you are suggesting: https://github.com/python-trio/trio/issues/606#issuecomment-515667078
@graingert I have implemented your request (finally!). Would you have a look at it? Also, @agronholm this might have implications for AnyIO so it would be better to have your input sooner than later!
This looks very interesting, and I can use this. I'm looking forward to adding this to AnyIO once it's released for Trio.
it should be type-safe now, but i'm a typing newbie so maybe I could get a review on that aspect as well?
What does this do if the thread call gets "uncancelled"? Ie this sequence:
- Some outer enclosing scope enters the cancelled state, so
run_sync
becomes cancelled- A more-inner enclosing scope gets mutated as
cscope.shield = True
, sorun_sync
is no longer cancelled- Then the user calls
check_cancelled
The straightforward implementation was to make this a level change in cancellation for the thread, so check_cancelled
forever raises Cancelled
IFF the cancellation delivery was attempted at least once. I believe the docs reflect this, though maybe that needs to be clarified.
I think this is also the "best" behavior because it is an approximation of what would happen if the thread were blocked in trio.from_thread.run(afn)
. In that case, even in your scenario, a Cancelled
would come out of afn
and take an indeterminate amount of time to propagate back to the host task, meaning that it could be delivered after raising shields.
also if the user has selected cancellable=True
, the previous behavior would be to wake up and abandon the thread. I wanted check_cancelled
to be a tool for abandoned threads to end early and should not depend on the host task's cancel status.
What does this do if the thread call gets "uncancelled"? Ie this sequence:
- Some outer enclosing scope enters the cancelled state, so
run_sync
becomes cancelled- A more-inner enclosing scope gets mutated as
cscope.shield = True
, sorun_sync
is no longer cancelled- Then the user calls
check_cancelled
Could you explain how the host task would get un-cancelled when it's still waiting on run_sync()
? And if it gets cancelled, then this is pretty clear: the thread gets the cancelled flag set too, no matter what happens to the host task afterwards.
What does this do if the thread call gets "uncancelled"?
I'd regard support for un-cancellation as strictly optional, doubly so regarding threads.
IMHO a cancel scope either shields, or it doesn't. Changing its mode after the fact smells like a race condition waiting to happen.
I'd be happy to hear suggestions for applying the context in from_thread.run that could avoid two extra checkpoints!
That's the only available way to change a task's context without creating a new task, and it's probably lower-overhead than the new task approach (but I didn't measure). The new task approach would be to push the reentrant call into a system task, which can have whatever context you want, and wrap it in a cancel scope that gets cancelled when the to_thread.run_sync
call gets cancelled. I probably prefer the approach here where the reentrant functions run underneath the original task, but they're both workable.
What does this do if the thread call gets "uncancelled"?
from_thread.check_cancelled()
: The initial cancellation will call to_thread.run_sync's abort_fn, which will save the raise_cancel and use it the next time the thread checks for cancellation; the "uncancellation" will not be noticed. I think this is fine; it's well-defined, deterministic modulo the question of whether the thread checks for cancellation at all in between when cancellation is asserted and when the thread completes (which is a source of nondeterminism even without shielding), and the Cancelled exception will propagate fine through the shield (we have and test a similar edge case in the non-threads world).
from_thread.run()
: The function that runs upon reentry is executed as a normal async call in the to_thread.run_sync
task, so it does see the uncancellation. I'm realizing that this results in a discrepancy between from_thread.run(trio.sleep, 0)
and from_thread.check_cancelled()
, which is a bit unfortunate, and probably we should document it, but I don't think it's a problem in practice. from_thread.check_cancelled()
can't observe that the 'uncancellation' has occurred at all unless it pokes into internals (the CancelStatus object). There's no way I know of to be woken up if 'uncancellation' occurs, so the main to_thread.run_sync
task can't help it out. Maybe inspecting CancelStatus is fine/worthwhile? Doesn't quite seem worth it to me (it breaks our rules about how non-_core
code is supposed to consume _core
code), but it's a near thing. But it would be easy to add later if we decide to; I'm comfortable taking the simpler road for now.
There was a comment upthread that said inspecting CancelStatus wouldn't be thread-safe, but I don't think that's actually a problem: if the check is racing with the uncancellation then AFAICT it doesn't really matter whether the check sees the uncancellation or not. However my earlier snipept (which would raise trio.Cancelled._create()
if the CancelStatus showed a cancelled status) was incorrect because it didn't handle the KeyboardInterrupt possibility. So we need the current raise_cancel
logic regardless; any CancelStatus check for uncancellation would be in addition to that.
from_thread.check_cancelled()
: The initial cancellation will call to_thread.run_sync's abort_fn, which will save the raise_cancel and use it the next time the thread checks for cancellation; the "uncancellation" will not be noticed. I think this is fine; it's well-defined, deterministic [...]
I think there may be another nondeterministic edge case: if the cancellation arrives while the task is busy in await message.run()
AND that work happens to be shielded, AND a shield gets raised uptree, then the main "have we ever been cancelled" check will not be triggered. I think a nursery with a task with the sole purpose of waiting for cancellation might be needed to get that exactly right.
As long as we need a nursery to get those semantics right, we might as well also consider an alternative implementation where the each from_thread
call spawns a new task in that nursery. After all, it's not really the fact that we are reusing the task which is so useful, but rather that it exists in the right cancellation, contextvar, and treevar contexts. Child tasks would have all these properties, although they might be marginally less efficient.
I think there may be another nondeterministic edge case: if the cancellation arrives while the task is busy in await message.run() AND that work happens to be shielded, AND a shield gets raised uptree, then the main "have we ever been cancelled" check will not be triggered. I think a nursery with a task with the sole purpose of waiting for cancellation might be needed to get that exactly right.
I think it depends what you mean by "exactly right". :-) The analogous situation without the thread would not raise a cancellation in this case, so I don't think the thread needs to either. In any event I don't think it would be worth the trouble and overhead of a whole separate task to wait for cancellation.
In general the way Trio behaves is that you need to execute a checkpoint while you are effectively cancelled (have a cancelled cancel scope closer to you than the closest shielded cancel scope) in order to raise Cancelled. If you perform a cancellation and then a shielding in the way that causes the effectively-cancelled status to toggle back and forth, but no checkpoints observe the "cancelled" state, then for your purposes it didn't happen. I think this analogizes nicely to threads if we take the "checkpoints" to be calls to check_cancelled()
+ (unshielded) checkpoints in reentrant calls to from_thread.run()
functions. We can only get that exact behavior if we peek inside the internal CancelStatus object on check_cancelled
; otherwise we will get a more conservative / "cancel-happy" behavior where check_cancelled
will raise Cancelled if the to_thread.run_sync()
call has ever been effectively cancelled, even if it's not anymore. Which is also reasonable! We just need to document the subtleties here.