aiohttp
aiohttp copied to clipboard
(client) connections cannot be really closed before EOF with uvloop enabled
Describe the bug
If uvloop
is enabled, as long as EOF is not reached, even if the response, connection, and session are all closed, the server from which aiohttp requested resources, will still keep sending packets to the aiohttp-client for some time. However, the bug is only reproducible with uvloop enabled.
To Reproduce
aiohttp[speedups]==3.8.1
uvloop==0.16.0
import asyncio
import aiohttp
import uvloop
import logging
logging.basicConfig(level=logging.DEBUG)
async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
logging.debug(response)
logging.debug(response.content)
async def reproduce(wait):
await fetch('https://sgp-ping.vultr.com/vultr.com.1000MB.bin')
await asyncio.sleep(wait) # now you will see that the file is still being downloaded
def main(enable_uvloop):
if enable_uvloop:
logging.info('********** Using uvloop **********')
uvloop.install() # the bug is only reproducible with uvloop
else:
logging.info('********** Using asyncio **********')
loop = asyncio.new_event_loop()
loop.set_debug(True) # or set env PYTHONASYNCIODEBUG=1, not necessary to reproduce the bug
asyncio.set_event_loop(loop)
loop.run_until_complete(reproduce(15 if enable_uvloop else 5))
loop.close()
if __name__ == '__main__':
main(enable_uvloop=False)
main(enable_uvloop=True)
input('Press Enter to exit')
Expected behavior
The connection should be closed and no packet from the server could be sent to the client anymore, no matter with or without uvloop
enabled, no matter reaching or not reaching EOF.
Logs/tracebacks
INFO:root:********** Using asyncio **********
DEBUG:asyncio:Using selector: EpollSelector
DEBUG:asyncio:Get address info sgp-ping.vultr.com:443, type=<SocketKind.SOCK_STREAM: 1>, flags=<AddressInfo.AI_ADDRCONFIG: 32>
INFO:asyncio:Getting address info sgp-ping.vultr.com:443, type=<SocketKind.SOCK_STREAM: 1>, flags=<AddressInfo.AI_ADDRCONFIG: 32> took 209.423ms: [(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('45.32.100.168', 443))]
DEBUG:asyncio:<asyncio.sslproto.SSLProtocol object at 0x7fc80064dfa0> starts SSL handshake
DEBUG:asyncio:<asyncio.sslproto.SSLProtocol object at 0x7fc80064dfa0>: SSL handshake took 225.2 ms
DEBUG:asyncio:<asyncio.TransportSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.1.101', 51708), raddr=('45.32.100.168', 443)> connected to 45.32.100.168:443: (<asyncio.sslproto._SSLProtocolTransport object at 0x7fc7ff9b5a00>, <aiohttp.client_proto.ResponseHandler object at 0x7fc800205b20>)
DEBUG:root:<ClientResponse(https://sgp-ping.vultr.com/vultr.com.1000MB.bin) [200 OK]>
<CIMultiDictProxy('Server': 'nginx', 'Date': 'Sat, 21 May 2022 00:29:55 GMT', 'Content-Type': 'application/octet-stream', 'Content-Length': '1048576000', 'Last-Modified': 'Mon, 20 Sep 2021 19:54:01 GMT', 'Connection': 'keep-alive', 'Etag': '"6148e6d9-3e800000"', 'Expires': 'Sun, 22 May 2022 00:29:55 GMT', 'Cache-Control': 'max-age=86400', 'X-Frame-Options': 'DENY', 'X-Content-Type-Options': 'nosniff', 'Access-Control-Allow-Origin': '*', 'Accept-Ranges': 'bytes')>
DEBUG:root:<StreamReader 15968 bytes>
DEBUG:asyncio:<asyncio.sslproto.SSLProtocol object at 0x7fc80064dfa0>: SSL error in data received
Traceback (most recent call last):
File "/usr/lib/python3.9/asyncio/sslproto.py", line 534, in data_received
ssldata, appdata = self._sslpipe.feed_ssldata(data)
File "/usr/lib/python3.9/asyncio/sslproto.py", line 206, in feed_ssldata
self._sslobj.unwrap()
File "/usr/lib/python3.9/ssl.py", line 948, in unwrap
return self._sslobj.shutdown()
ssl.SSLError: [SSL: APPLICATION_DATA_AFTER_CLOSE_NOTIFY] application data after close notify (_ssl.c:2756)
DEBUG:asyncio:Close <_UnixSelectorEventLoop running=False closed=False debug=True>
INFO:root:********** Using uvloop **********
DEBUG:asyncio:<uvloop.loop.SSLProtocol object at 0x7fc8040f8b40> starts SSL handshake
DEBUG:asyncio:<uvloop.loop.SSLProtocol object at 0x7fc8040f8b40>: SSL handshake took 227.0 ms
DEBUG:root:<ClientResponse(https://sgp-ping.vultr.com/vultr.com.1000MB.bin) [200 OK]>
<CIMultiDictProxy('Server': 'nginx', 'Date': 'Sat, 21 May 2022 00:30:01 GMT', 'Content-Type': 'application/octet-stream', 'Content-Length': '1048576000', 'Last-Modified': 'Mon, 20 Sep 2021 19:54:01 GMT', 'Connection': 'keep-alive', 'Etag': '"6148e6d9-3e800000"', 'Expires': 'Sun, 22 May 2022 00:30:01 GMT', 'Cache-Control': 'max-age=86400', 'X-Frame-Options': 'DENY', 'X-Content-Type-Options': 'nosniff', 'Access-Control-Allow-Origin': '*', 'Accept-Ranges': 'bytes')>
DEBUG:root:<StreamReader 15968 bytes>
Press Enter to exit
Python Version
Python 3.9.12 (main, Mar 24 2022, 13:02:21) [GCC 11.2.0] on linux
aiohttp Version
Name: aiohttp
Version: 3.8.1
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author:
Author-email:
License: Apache 2
Location: /home/rongrong/venv/RSStT/lib/python3.9/site-packages
Requires: aiosignal, async-timeout, attrs, charset-normalizer, frozenlist, multidict, yarl
Required-by: aiohttp-retry, aiohttp-socks
multidict Version
Name: multidict
Version: 6.0.2
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /home/rongrong/venv/RSStT/lib/python3.9/site-packages
Requires:
Required-by: aiohttp, yarl
yarl Version
Name: yarl
Version: 1.7.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /home/rongrong/venv/RSStT/lib/python3.9/site-packages
Requires: idna, multidict
Required-by: aiohttp
OS
_,met$$$$$gg. rongrong@envy
,g$$$$$$$$$$$$$$$P. -------------
,g$$P" """Y$$.". OS: Debian GNU/Linux bookworm/sid x86_64
,$$P' `$$$. Host: HP ENVY x360 Convertible 15-dr1xxx
',$$P ,ggs. `$$b: Kernel: 5.17.0-1-amd64
`d$$' ,$P"' . $$$ Uptime: 7 hours, 43 mins
$$P d$' , $$P Packages: 3453 (dpkg)
$$: $$. - ,d$$' Shell: bash 5.1.16
$$; Y$b._ _,d$P' Resolution: 1080x1920, 1920x1080
Y$$. `.`"Y$$$$P"' DE: Plasma 5.24.5
`$$b "-.__ WM: KWin
`Y$$ WM Theme: Breeze 微风
`Y$$. Theme: Breeze Light [Plasma], Breeze [GTK2/3]
`$$b. Icons: breeze-dark [Plasma], breeze-dark [GTK2/3]
`Y$$b. Terminal: konsole
`"Y$b._ CPU: Intel i7-10510U (8) @ 4.900GHz
`""" GPU: NVIDIA GeForce MX250
GPU: Intel CometLake-U GT2 [UHD Graphics]
Memory: 16156MiB / 31774MiB
Related component
Client
Additional context
The bug has also been reported to uvloop
: https://github.com/MagicStack/uvloop/issues/471
Code of Conduct
- [X] I agree to follow the aio-libs Code of Conduct
Some statistics in my production environment. (In my use case only the response headers and/or the beginning 4KB of the response body is needed.)
That probably explains some of the frequent test failures with uvloop. Would be great if someone figures out the fix.
I'd done some deep-dive and found that the bug is a TLS/SSL-relevant bug. Non-TLS connections can be closed even if uvloop
is enabled.
- await fetch('https://sgp-ping.vultr.com/vultr.com.1000MB.bin')
+ await fetch('http://sgp-ping.vultr.com/vultr.com.1000MB.bin')
I also analyzed the web traffic with Wireshark. With asyncio
, RST
is immediately sent once the connection was closed; but with uvloop
, closing the connection won't change anything as if is it not closed, RST
is only sent after the event loop is closed.
Another deep-dive:
async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
logging.debug(response)
logging.debug(response.content)
+ await asyncio.sleep(5) # at this point, only a little part of the response is read, then pends
+ return # after the connection is "closed", uvloop will resume reading the response
A deeper dive shows the direct cause of the bug.
async def reproduce(wait):
await fetch('https://sgp-ping.vultr.com/vultr.com.1000MB.bin')
- await asyncio.sleep(wait) # now you will see that the file is still being downloaded
+ await asyncio.sleep(35) # the download will last for exactly 30s, then abort
aiohttp
closing its high-level connection causes uvloop
to call _SSLProtocolTransport.close()
.
https://github.com/MagicStack/uvloop/blob/3e71ddc3383a8e0546519117c8775323c19eb6d7/uvloop/sslproto.pyx#L42-L51
def close(self):
"""Close the transport.
Buffered data will be flushed asynchronously. No more data
will be received. After all buffered data is flushed, the
protocol's connection_lost() method will (eventually) called
with None as its argument.
"""
self._closed = True
self._ssl_protocol._start_shutdown(self.context.copy())
https://github.com/MagicStack/uvloop/blob/3e71ddc3383a8e0546519117c8775323c19eb6d7/uvloop/sslproto.pyx#L559-L574
cdef _start_shutdown(self, object context=None):
if self._state in (FLUSHING, SHUTDOWN, UNWRAPPED):
return
# we don't need the context for _abort or the timeout, because
# TCP transport._force_close() should be able to call
# connection_lost() in the right context
if self._app_transport is not None:
self._app_transport._closed = True
if self._state == DO_HANDSHAKE:
self._abort(None)
else:
self._set_state(FLUSHING)
self._shutdown_timeout_handle = \
self._loop.call_later(self._ssl_shutdown_timeout,
lambda: self._check_shutdown_timeout())
self._do_flush(context)
We can easily verify that the low-level connection is closed 30 seconds after aiohttp
have "closed" its high-level connection (as long as the response body is big enough). This complies with SSL_SHUTDOWN_TIMEOUT
. That is to say, it is SSLProtocol._ssl_shutdown_timeout()
that eventually closes the connection.
https://github.com/MagicStack/uvloop/blob/3e71ddc3383a8e0546519117c8775323c19eb6d7/uvloop/includes/consts.pxi#L24
DEF SSL_SHUTDOWN_TIMEOUT = 30.0
Now that SSLProtocol._ssl_shutdown_timeout()
is called, the conclusion is evident that SSLProtocol._do_flush()
causes the reading to be resumed but meanwhile is stuck in something until EOF or timeout is reached, causing the reading sustained.
https://github.com/MagicStack/uvloop/blob/3e71ddc3383a8e0546519117c8775323c19eb6d7/uvloop/sslproto.pyx#L600-L618
cdef _do_flush(self, object context=None):
"""Flush the write backlog, discarding new data received.
We don't send close_notify in FLUSHING because we still want to send
the remaining data over SSL, even if we received a close_notify. Also,
no application-level resume_writing() or pause_writing() will be called
in FLUSHING, as we could fully manage the flow control internally.
"""
try:
self._do_read_into_void(context)
self._do_write()
self._process_outgoing()
self._control_ssl_reading()
except Exception as ex:
self._on_shutdown_complete(ex)
else:
if not self._get_write_buffer_size():
self._set_state(SHUTDOWN)
self._do_shutdown(context)
Before uvloop
fixes the bug, here's a simple hotfix to stop uvloop
from reading the response happily and not willing to close the connection.
class AiohttpUvloopTransportHotfix(contextlib.AbstractAsyncContextManager):
def __init__(self, response: aiohttp.ClientResponse):
self.response = response
async def __aexit__(self, exc_type, exc_val, exc_tb):
if UVLOOP_ENABLED:
self.response.connection.transport.abort()
async def fetch(url):
async with aiohttp.ClientSession() as session:
async with session.get(url) as response:
async with AiohttpUvloopTransportHotfix(response):
...
CC @Dreamsorcerer
Impressive deep dive. If this is a uvloop bug, have you reported it there? A link to it here would be useful for tracking.
I'd already opened a mirror issue before.
Additional context
The bug has also been reported to
uvloop
: https://github.com/MagicStack/uvloop/issues/471
Ah, great. I wonder if there's an easy way to put a temp fix in aiohttp, which would hopefully get our CI tests stable.
Maybe we can just add those lines into ClientSession, possibly even just monkey patching it in the test code.
Ah, great. I wonder if there's an easy way to put a temp fix in aiohttp, which would hopefully get our CI tests stable.
Maybe we can just add those lines into ClientSession, possibly even just monkey patching it in the test code.
Temporarily fixing it is not difficult, just call transport.abort()
instead of transport.close()
if the transport is a uvloop
SSL protocol transport. But there is indeed a lot to do since there are a lot of transport.close()
calls scattered in aiohttp
. I am not really sure which ones are not affected by the uvloop
bug. The problem also exists in the monkey-patching-in-test-code solution - when to monkey-patch and when not?
asyncio
: graceful close (general behavior)
uvloop
: hard abort (with my hotfix)
Hi @Rongronggg9, could this somehow lead to a memory leak? I'm facing an issue where after switching to aiohttp (3.8.1) we began getting high RAM consumption after spikes in requests but which doesn't drop afterwards.
We also use uvloop and make lots of TLS/SSL connections. I will try to test without uvloop to confirm.
@beesaferoot Hmmm... I don't think so, because uvloop will eventually force abort the connection after 30s as I've described before. Also, in my usage, either with/without uvloop or with/without my uvloop hotfix, aiohttp never shows any potential to leak memory.
My advice is to check if there are unclosed responses, sessions, or so on. Also, if you are using CPython built with Glibc (in most cases it is exactly what you are using on Linux), you should consider the possibility of memory fragmentation (for more details: https://github.com/kurtmckee/feedparser/issues/287).
I see, I will investigate inline with your suggestion. Thanks for the insight
It appears to be a fragmentation problem, as setting the env variable MALLOC_MMAP_THRESHOLD_
stops the ever increasing memory!
uvloop v0.17.0 was just released, but still buggy
I've also tested the vanilla asyncio in CPython 3.11.0rc1/rc2 since it has "borrowed"[^1] the SSL implementation from uvloop
. Still, the connection is closed as expected.
But a fatal error is logged:
DEBUG:asyncio:<asyncio.sslproto.SSLProtocol object at 0x7fdacf7879d0>: Fatal error on transport
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/sslproto.py", line 646, in _do_shutdown
self._sslobj.unwrap()
File "/usr/lib/python3.11/ssl.py", line 983, in unwrap
return self._sslobj.shutdown()
^^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLError: [SSL: APPLICATION_DATA_AFTER_CLOSE_NOTIFY] application data after close notify (_ssl.c:2672)
I consider the fatal error is pretty OK:
- https://github.com/python/cpython/issues/84132
[^1]: https://github.com/python/cpython/issues/88177, https://github.com/python/cpython/pull/31275
So here locates the buggy code:
cdef _do_flush(self, object context=None):
try:
self._do_read_into_void(context)
self._do_write()
self._process_outgoing()
self._control_ssl_reading()
except Exception as ex:
self._on_shutdown_complete(ex)
else:
if not self._get_write_buffer_size():
self._set_state(SHUTDOWN)
self._do_shutdown(context)
cdef _do_shutdown(self, object context=None):
try:
# we must skip all application data (if any) before unwrap
self._do_read_into_void(context) # <--- I am so sad because I am buggy
try:
self._sslobj.unwrap()
except ssl_SSLAgainErrors as exc:
self._process_outgoing()
else:
self._process_outgoing()
if not self._get_write_buffer_size():
self._on_shutdown_complete(None)
except Exception as ex:
self._on_shutdown_complete(ex)
Being commented out, uvloop
shows 100% the same behavior as vanilla asyncio in CPython 3.11 (closing the connection as expected and logging the fatal error):
DEBUG:asyncio:<uvloop.loop.SSLProtocol object at 0x7f9cbfbcf5e0>: Error occurred during shutdown
Traceback (most recent call last):
File "uvloop/sslproto.pyx", line 624, in uvloop.loop.SSLProtocol._do_shutdown
File "/usr/lib/python3.11/ssl.py", line 983, in unwrap
return self._sslobj.shutdown()
^^^^^^^^^^^^^^^^^^^^^^^
ssl.SSLError: [SSL: APPLICATION_DATA_AFTER_CLOSE_NOTIFY] application data after close notify (_ssl.c:2672)
You may also inspect the source code of both uvloop and asyncio yourself. Vanilla asyncio in CPython 3.11 indeed does NOT read before unwrap in the shutdown stage.
def _do_flush(self):
self._do_read()
self._set_state(SSLProtocolState.SHUTDOWN)
self._do_shutdown()
def _do_shutdown(self):
try:
if not self._eof_received:
self._sslobj.unwrap()
except SSLAgainErrors:
self._process_outgoing()
except ssl.SSLError as exc:
self._on_shutdown_complete(exc)
else:
self._process_outgoing()
self._call_eof_received()
self._on_shutdown_complete(None)
https://github.com/python/cpython/blob/3.11/Lib/asyncio/sslproto.py#L643
CC
The issue has been located: https://github.com/MagicStack/uvloop/issues/471#issuecomment-1247584288 https://github.com/MagicStack/uvloop/issues/471#issuecomment-1249170910 https://github.com/MagicStack/uvloop/issues/471#issuecomment-1264470819
However, more discussion is needed since the issue is actually caused by a widespread implementation practice violating the specification: https://github.com/MagicStack/uvloop/issues/471#issuecomment-1264648224