asyncio.Task.cancel documentation is inaccurate
Problem
The documentation on asyncio.Task.cancel says
This arranges for a CancelledError exception to be thrown into the wrapped coroutine on the next cycle of the event loop.
The coroutine then has a chance to clean up or even deny the request by suppressing the exception with a try … … except CancelledError … finally block. Therefore, unlike Future.cancel(), Task.cancel() does not guarantee that the Task will be cancelled, although suppressing cancellation completely is not common and is actively discouraged.
These lines appear to suggest that a CancelledError will always be raised into the coroutine, which is not true -- if cancel() is called before the Task starts running, the task is guaranteed to be cancelled -- the coroutine is not given a chance to run at all and no exception is thrown into it.
Example
The following program prints nothing on stdout and exits with code 1.
import asyncio

async def test():
    try:
        print("test() is called!")
        await asyncio.sleep(1)
    except asyncio.CancelledError:
        print("test() is cancelled!")

async def main():
    task = asyncio.create_task(test())
    task.cancel()
    await task

asyncio.run(main())
Suggestion
Clarify that cancelling a task that has not started is always "effective" and will not cause an exception to be thrown into the coroutine.
Specifically, the quoted paragraphs might be amended as follows (key changes in bold):
If the Task has not started, it will be done with CancelledError and will not be started.
If the Task has started, a CancelledError exception will be thrown into the wrapped coroutine on the next cycle of the event loop. The coroutine then has a chance to clean up or even deny the request by suppressing the exception with a try … … except CancelledError … finally block. Therefore, unlike Future.cancel(), Task.cancel() does not guarantee that a started Task will be cancelled, although suppressing cancellation completely is not common and is actively discouraged.
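The two cases in the proposed wording can be compared side by side. The following is only a sketch: the names worker, events, and outcomes are made up for illustration, and the observed ordering reflects the default event loop in current CPython rather than anything the documentation promises.

```python
import asyncio

events = []  # records what each coroutine body observed

async def worker(name):
    try:
        events.append(f"{name}: started")
        await asyncio.sleep(1)
    except asyncio.CancelledError:
        events.append(f"{name}: saw CancelledError")
        raise  # re-raise so the task still ends up cancelled

async def main():
    # Case 1: cancel before the task has had a chance to start.
    early = asyncio.create_task(worker("early"))
    early.cancel()

    # Case 2: let the task start first, then cancel it.
    late = asyncio.create_task(worker("late"))
    await asyncio.sleep(0)  # yield to the loop so pending tasks can run
    late.cancel()

    results = await asyncio.gather(early, late, return_exceptions=True)
    return [type(r).__name__ for r in results]

outcomes = asyncio.run(main())
print(events)
print(outcomes)
```

Both tasks end up cancelled, but only the "late" worker records anything: the body of the "early" worker never runs, so its except block never sees the exception.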
This happens because the exception is thrown into the coroutine before execution reaches the first try. If you want to add a note, please send a PR.
I find the proposed clarification useful, but I think the task state "not started yet" should be somehow defined or explained. As a developer I had to learn it from other sources and would prefer if the documentation mentioned when a task is started. AFAIK, the scheduler code must run and that requires an await of a future that is not done yet. Maybe a tip could be added that await asyncio.sleep(0) after asyncio.create_task will start the created task.
@xitop, good suggestion. A somewhat precise definition is that a task scheduled by create_task is started in the next event loop iteration, after callbacks scheduled before the create_task call are completed.
So the “precise definition” involves more concepts to define :-)
For example, calling asyncio.sleep(0) ensures code in the current coroutine resumes after the task is started, but it doesn’t affect the scheduling of other coroutines and callbacks.
I don’t think we should promise it will run in the next event loop iteration, or after what’s already scheduled. But we should make clear that it doesn’t run before the next iteration.
Why should we not say that the created task will start in the next event loop iteration? After all, it is scheduled by call_soon and the doc on call_soon says:
Schedule the callback callback to be called with args arguments at the next iteration of the event loop.
I understand that the loop may be stopped or interrupted by KeyboardInterrupt or SystemExit, in which case the “next” loop iteration may not come “next”, so to speak.
Would it be helpful to say the scheduled task will start as “soon” as call_”soon”?
I don’t want to commit to implementation details. Even call_soon is carefully named. :-)
Ideally, if one were to create thousands of tasks, we should be allowed to check for I/O before they’ve all been started, if we’ve figured out that that improves some aspect of a server such as throughput, responsiveness, or whatever.
The key point here is really to make it clear that it won’t start right away (not even at the next ‘await’ unless it blocks for I/O or time).
I understand that each sentence in the docs must be carefully chosen. I am not suggesting any text, because my knowledge of the English language and of the asyncio library internals simply does not allow that. That's why I wrote "somehow define or explain".
The core of this proposal is to make programmers aware that an exception might be raised NOT inside the task's coroutine, because the task hasn't been started yet. Based on my previous experience, all I am asking is to please also address, in some way, the implied programming questions "(1) so, when will it be started?" or "(2) how can I make sure my task has been started?" or maybe (not sure about this one) "how can I test whether my task was started?".
If the answer to question (2) really is as simple as await asyncio.sleep(0) then from a practical point of view an imprecise answer to question (1) like "when the scheduler decides to do so" is sufficient.
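One way to approach question (2) without relying on scheduling details is to have the coroutine announce that it has begun. This is a sketch under assumptions, not something asyncio provides for this purpose: the started event and the worker and observed names are hypothetical. Note that if the task were cancelled before starting, the event would never be set and the wait would not return.

```python
import asyncio

async def worker(started: asyncio.Event):
    started.set()  # first statement: signals that the body has begun running
    await asyncio.sleep(0.01)
    return "done"

async def main():
    started = asyncio.Event()
    task = asyncio.create_task(worker(started))
    before = started.is_set()  # create_task only schedules the task
    await started.wait()       # suspends until worker() has actually begun
    after = started.is_set()
    result = await task
    return before, after, result

observed = asyncio.run(main())
print(observed)
```

Once started.wait() returns, the caller knows the coroutine body has been entered, so a later cancel() is delivered inside the coroutine rather than before it runs.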
According to @gvanrossum’s comments, the task is expected to start after the current loop iteration but before any blocking I/O polling.
The current implementation starts the task in the next loop iteration, but that’s considered an implementation detail rather than specification.
@xitop, for your question 2, yes, after calling create_task, you need to relinquish control to the asyncio event loop for it to have a chance to start the task (or run any callback). await anything would do, as well as return, for example. Does this answer your question?
@fancidev Not 100% sure why you are asking. To make it simple, let me summarize. This issue with its PR explains what happens if a task was not started yet. I know that a programmer's life would be simpler without the need to distinguish those two cases (started or not). If everybody agrees, please add a sentence somewhere (where it is logical) basically saying that await asyncio.sleep(0) ensures freshly created tasks are not only scheduled, but also started. (Other await ... may or may not do that.) Thank you.
But await asyncio.sleep(0) does not ensure the task is started… it’s just one of the ways to make it possible for a task to get started “soon”; other ways being await anything (such as the created task), return, or raise.
Put another way, even if you await asyncio.sleep(0) immediately after create_task, it is still possible to cancel the task before it starts (e.g. if you cancel the task from a callback that happened to be scheduled right after the coroutine that called create_task).
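A simplified variant of this scenario can be reproduced as follows. This is a sketch under assumptions: here the cancelling callback is queued with call_soon just before create_task, which is one contrived way to get it ahead of the new task's first step, and the result depends on the callback ordering of the default event loop in current CPython. The names child, ran, and was_cancelled are made up for illustration.

```python
import asyncio

ran = []  # stays empty if the body of child() never executes

async def child():
    ran.append("child body ran")

async def main():
    loop = asyncio.get_running_loop()
    # Queued before the task's first step, so it runs first.
    # The lambda looks up `task` only when called, after it is assigned.
    loop.call_soon(lambda: task.cancel())
    task = asyncio.create_task(child())
    await asyncio.sleep(0)  # yield to the loop once
    return task.cancelled()

was_cancelled = asyncio.run(main())
print(was_cancelled, ran)
```

Even though main() awaits asyncio.sleep(0) right after create_task, the task is already cancelled when main() resumes, and the body of child() has not run.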
Correction: awaiting anything is not enough… the thing being awaited needs to have a suspension point in it soon, so asyncio.sleep(0) is good, as are Futures that are not yet done, etc.
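The difference between an await that suspends and one that does not can be seen in a small experiment. This is only a sketch: returns_immediately, child, and log are hypothetical names, and the ordering shown is what the default event loop in current CPython produces, not a documented guarantee.

```python
import asyncio

log = []

async def child():
    log.append("child started")

async def returns_immediately():
    return 42  # contains no suspension point

async def main():
    loop = asyncio.get_running_loop()
    task = asyncio.create_task(child())

    # Awaiting a coroutine that never suspends does not yield to the loop.
    await returns_immediately()
    log.append("after non-suspending coroutine")

    # Awaiting an already-done Future does not yield to the loop either.
    done_future = loop.create_future()
    done_future.set_result(None)
    await done_future
    log.append("after done Future")

    # sleep(0) does suspend, which gives the loop a chance to run the task.
    await asyncio.sleep(0)
    log.append("after sleep(0)")
    await task

asyncio.run(main())
print(log)
```

In this run, "child started" appears only after the two non-suspending awaits and before "after sleep(0)".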
The question is now: does asyncio.sleep(0) reliably start scheduled tasks? I thought it exists for this purpose: "Setting the delay to 0 provides an optimized path to allow other tasks to run".
If asyncio does not provide a reliable way to start tasks, I have nothing helpful to add to this PR.
FWIW the rest of the (long) discussion is happening on the PR, GH-98321.