CancelledError suppressed where it should not
Describe the bug
The ClientRequest.close method suppresses asyncio.CancelledError:
https://github.com/aio-libs/aiohttp/blob/006fbe03fede4eaa1eeba7b8393cbf4d63cb44b6/aiohttp/client_reqrep.py#L680
This is incorrect as this error does not always bubble up from the inside _writer task. It can also bubble down from the outside (if the task executing this ClientRequest.close call is canceled by another party). Suppressing the later case makes the aiohttp client unreliable as it may not honor cancellation demands from the outside.
To Reproduce
Read the code above. A specific test case may be built.
Expected behavior
A specific utility function shall be used, to be able to catch only CancelledError bubbling up from the inside _writer task and let all other exceptions (whether bubbling down or up) go through.
As an example, here is such a utility method:
T = typing.TypeVar("T")
def absorb_inner_cancel(
awaitable: collections.abc.Awaitable[T],
) -> collections.abc.Awaitable[None | T]:
"""Wait for the given awaitable to be completed and return its result,
and absorb any CancelledError that did not originate from this call (inner
cancel), returning None in such case. Let other exceptions pass through
unchanged. In particular, a cancellation performed on this call (outer
cancel) is honored and propagated as usual.
"""
inner = asyncio.ensure_future(awaitable)
if inner.done():
# The awaitable is already done, perform directly the cancel absorption
if inner.cancelled():
# The awaitable was already cancelled, absorb it
done_future: asyncio.Future[None] = asyncio.Future()
done_future.set_result(None)
return done_future
# The awaitable had no cancel, return it directly
return awaitable
# The awaitable is not done yet, plug the callback to catch and absorb
# inner cancels appropriately
outer = _AbsorbFuture(inner)
def on_inner_done(inner: asyncio.Future[T]) -> None:
if outer.done():
return
if inner.cancelled():
outer._complete_cancel() # pylint: disable=W0212
return
exception = inner.exception()
if exception is not None:
outer.set_exception(exception)
return
result = inner.result()
outer.set_result(result)
inner.add_done_callback(on_inner_done)
return outer
class _AbsorbFuture(asyncio.Future[None | T]):
__slots__ = ("_inner_future", "_cancel_requested", "_cancel_msg")
def __init__(self, inner_future: asyncio.Future[T]) -> None:
super().__init__()
self._inner_future = inner_future
self._cancel_requested = False
self._cancel_msg: None | str = None
def cancel(self, msg: None | str = None) -> bool:
if self.done():
return False
cancelled = self._inner_future.cancel(msg=msg)
if not cancelled:
return False
self._cancel_requested = True
self._cancel_msg = msg
return True
def _complete_cancel(self) -> None:
if self._cancel_requested:
# This cancel was requested by this future (outer cancel), so we
# confirm it
super().cancel(msg=self._cancel_msg)
else:
# This cancel was not requested by this future but has bubbled up
# from the inner future (inner cancel), so we absorb it
self.set_result(None)
This code may seem complex, but it does cover all the edge cases as specified. This is very close to the code of asyncio.gather, just faster and much simpler as we are waiting on a single awaitable here.
It can be used this way to wrap the _writer field: in place of https://github.com/aio-libs/aiohttp/blob/006fbe03fede4eaa1eeba7b8393cbf4d63cb44b6/aiohttp/client_reqrep.py#L661 have
self._writer = absorb_inner_cancel(self.write_bytes(writer, conn))
Logs/tracebacks
None
Python Version
$ python --version
Python 3.11.6
aiohttp Version
$ python -m pip show aiohttp
Name: aiohttp
Version: 3.9.3
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author:
Author-email:
License: Apache 2
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires: aiosignal, attrs, frozenlist, multidict, yarl
Required-by: aiohttp-jinja2, asyncpraw, asyncprawcore, datasets, httpstan, pystan
multidict Version
$ python -m pip show multidict
Name: multidict
Version: 6.0.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires:
Required-by: aiohttp, yarl
yarl Version
$ python -m pip show yarl
Name: yarl
Version: 1.9.4
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache-2.0
Location: /opt/homebrew/Caskroom/miniforge/base/envs/work/lib/python3.11/site-packages
Requires: idna, multidict
Required-by: aiohttp, asyncprawcore
OS
MacOS Sonoma 14.2.1
Related component
Client
Additional context
No response
Code of Conduct
- [X] I agree to follow the aio-libs Code of Conduct
This is something we just fixed in aiohttp-sse. Should probably have a look through all our code to find similar issues.
As in aiohttp-sse, I'd be fine with a fix that only works on Python 3.11, as it's awkward to fix on older versions and has been like that forever: https://github.com/aio-libs/aiohttp-sse/pull/459/files
Further information: https://superfastpython.com/asyncio-task-cancellation-best-practices/#Practice_1_Do_Not_Consume_CancelledError
I have read your patch ( https://github.com/aio-libs/aiohttp-sse/pull/459/files ). However I am not sure it is safe. It assumes that, if the current task is cancelled, then it is due to the same exception we are just processing in the exceptbranch, as it re-raises it. I am not sure this has no consequence. My experience with asyncio has been that some innocent assumptions can lead to surprising edge cases.
I wrote initially that I thought that my code was safer. But, reading your code that seems very logical and simple (even if requiring python >= 3.11), I am not sure. Maybe they are equivalent technically.
That would be two versions available for this patch:
- Your version which is much more simple (kudos for this) but requires python >= 3.11
- Mine which is universal (as long as asyncio is available) but takes more code to do the same, as it needs to go down into the asyncio Future machinery - yet despite the appearance it is not exotic (this code is like
asyncio.gather, just simpified)
It assumes that, if the current task is cancelled, then it is due to the same exception we are just processing in the
exceptbranch, as it re-raises it.
I suggest reading the linked article. The current task is cancelling because it is being asked to cancel (i.e. cancelled from outside the task). In this situation, the cancellation should be propagated. If the inner task has been cancelled and we are receiving the exception as it bubbles up, then the current task is not cancelling(). There are no assumptions in that logic.
To avoid maintaining rather verbose, complex code, I'd personally prefer to just fix this on 3.11+. I think you're the first user to actually report this, so I'm not too worried about the behaviour existing in the oldest versions of Python for a little while longer, there are similar issues in asyncio itself in those versions regardless.
I agree. Your code is much (much) more maintainable and forward-looking.