Received "Response payload is not completed" when reading response
🐞 Describe the bug
A ClientPayloadError: Response payload is not completed exception is raised when reading the response of a GET request to the GitHub REST API (the particular API should not be especially relevant). The response itself appears to be correct, and the same request made with curl succeeds every time.
💡 To Reproduce
Make lots of requests against an endpoint of the GitHub API; the error only happens from time to time.
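For illustration, a minimal reproducer sketch along these lines (the endpoint is a placeholder modeled on the labels URL in the context below; authentication is omitted and the loop count is arbitrary):

import asyncio

import aiohttp

# Placeholder endpoint; the failure is intermittent, so many requests are needed.
LABELS_URL = "https://api.github.com/repos/testorg/test-debian/labels"

async def main() -> None:
    async with aiohttp.ClientSession() as session:
        for _ in range(1000):
            async with session.get(LABELS_URL) as response:
                await response.read()  # occasionally raises ClientPayloadError

asyncio.run(main())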
💡 Expected behavior
The response is correctly parsed.
📋 Logs/tracebacks
File "/opt/lib/python3.8/site-packages/gidgethub/aiohttp.py", line 20, in _request
return response.status, response.headers, await response.read()
File "/opt/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 973, in read
self._body = await self.content.read()
File "/opt/lib/python3.8/site-packages/aiohttp/streams.py", line 358, in read
block = await self.readany()
File "/opt/lib/python3.8/site-packages/aiohttp/streams.py", line 380, in readany
await self._wait('readany')
File "/opt/lib/python3.8/site-packages/aiohttp/streams.py", line 296, in _wait
await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
Some contextual information at the time of the exception, at aiohttp/streams.py, line 358, in read:
blocks = [b'[\n {\n "id": 2941521,\n "node_id": "MDU6TGFiZWwyOTQxNTIx",\n "url": "https://REDACTED_GITHUB_URL/repos/testorg/test-debian/labels/skip-issue",\n "name": "skip-issue",\n "color": "000000",\n "default": false\n }\n]\n']
n=-1
As you can see, the blocks contain the whole payload (the full JSON), but aiohttp still complains about the payload not being completed.
📋 Your version of Python
$ python --version
Python 3.8.1
📋 Your version of the aiohttp/yarl/multidict distributions
$ python -m pip show aiohttp
Name: aiohttp
Version: 3.6.2
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: Nikolay Kim
Author-email: [email protected]
License: Apache 2
Location: .../versions/3.8-dev/envs/medusa/lib/python3.8/site-packages
Requires: async-timeout, attrs, chardet, multidict, yarl
Required-by: raven-aiohttp, pytest-aiohttp, aiojobs
$ python -m pip show multidict
Name: multidict
Version: 4.7.3
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: .../versions/3.8-dev/envs/medusa/lib/python3.8/site-packages
Requires:
Required-by: yarl, aiohttp
$ python -m pip show yarl
Name: yarl
Version: 1.4.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /Users/pgalindo3/.pyenv/versions/3.8-dev/envs/medusa/lib/python3.8/site-packages
Requires: idna, multidict
Required-by: aiohttp
📋 Additional context
I suspect this has to do with this comment:
https://github.com/aio-libs/aiohttp/blob/1740c99159176344b02006446b4302d0f92f9ac2/aiohttp/streams.py#L352-L356
as this is where it is failing, and the situation seems to be the one described in the TODO.
CC: @socketpair @asvetlov as the TODO was added in https://github.com/aio-libs/aiohttp/commit/5c4cb823b35fb452c67fb8070c417df5980c2fd8
I'm curious if it's somehow related to #4549.
I encountered these two problems at the same time (#4549 and #4581). Python: 3.8.4, aiohttp: 3.6.2.
I think they are related, and the core condition seems to be frequent requests to a single domain name.
Changing the while to an if does not seem to help.
Hi there, I've hit this error too. My use case is a reverse proxy in front of gunicorn. The other reports of this error in this repo appear to involve poorly behaved HTTP servers, which I think is a separate issue.
My code streams the proxy body back to upstream, using code roughly like:
response = web.StreamResponse(headers=response_headers, status=status)
await response.prepare(request)  # prepare before writing the streamed body
async for chunk in gunicorn_response.content.iter_chunked(64 * 1_024):
    await response.write(chunk)
await response.write_eof()
return response
What seems to be happening is that gunicorn returns a Connection: close header and then there is a race condition between aiohttp reading the body and the connection being closed. If I get the data out of aiohttp in time, it works, but sometimes this stack trace is triggered.
Investigation:
- Calling await gunicorn_response.content.wait_eof() prior to calling iter_chunked reduces the incidence of the error, but it still happens occasionally.
- Setting the Connection: keep-alive header in the reverse proxy's request to gunicorn resolves the issue; this is an acceptable workaround for me (see the sketch after this list).
- Changing the while to an if in the source code, in conjunction with calling wait_eof() prior to iterating over the body, resolves the issue.
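A sketch of that keep-alive workaround, assuming a plain aiohttp reverse-proxy handler (the upstream URL is a placeholder and header forwarding is omitted; this is not the exact production code):

from aiohttp import ClientSession, web

GUNICORN_URL = "http://127.0.0.1:8000/"  # placeholder upstream address

async def proxy(request: web.Request) -> web.StreamResponse:
    async with ClientSession() as session:
        # Ask gunicorn to keep the connection open so reading the body is not
        # racing against a Connection: close teardown.
        async with session.get(GUNICORN_URL, headers={"Connection": "keep-alive"}) as upstream:
            response = web.StreamResponse(status=upstream.status)
            await response.prepare(request)
            async for chunk in upstream.content.iter_chunked(64 * 1_024):
                await response.write(chunk)
            await response.write_eof()
            return response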
We have this happening between an aiohttp 3.8.1 client and server in production; trying to reproduce.
I've just experienced the same issue via s3fs, downloading a few hundred files from S3.
Python: 3.7.12 aiohttp: 3.8.1 yarl: 1.7.2 multidict: 5.2.0
Traceback (most recent call last):
File "dataset_io.py", line 336, in download
filesystem.get(self._location, self._download_location, recursive=True)
File "/.venv/lib/python3.7/site-packages/fsspec/asyn.py", line 91, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/.venv/lib/python3.7/site-packages/fsspec/asyn.py", line 71, in sync
raise return_result
File "/.venv/lib/python3.7/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/.venv/lib/python3.7/site-packages/fsspec/asyn.py", line 529, in _get
coros, batch_size=batch_size, callback=callback
File "/.venv/lib/python3.7/site-packages/fsspec/asyn.py", line 249, in _run_coros_in_chunks
await asyncio.gather(*chunk, return_exceptions=return_exceptions),
File "/root/.pyenv/versions/3.7.12/lib/python3.7/asyncio/tasks.py", line 414, in wait_for
return await fut
File "/.venv/lib/python3.7/site-packages/s3fs/core.py", line 1002, in _get_file
chunk = await body.read(2 ** 16)
File "/.venv/lib/python3.7/site-packages/aiobotocore/response.py", line 53, in read
chunk = await self.__wrapped__.read(amt if amt is not None else -1)
File "/.venv/lib/python3.7/site-packages/aiohttp/streams.py", line 385, in read
await self._wait("read")
File "/.venv/lib/python3.7/site-packages/aiohttp/streams.py", line 304, in _wait
await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
Sorry I don't have much more information here, other than the fact that this is occurring on Python 3.7, whereas #4549 has been described as occurring on Python 3.8 only.
I got this error on Python 3.9 with aiohttp 3.8.1 on Windows. I haven't been able to reproduce.
My server failed around the same time as the error. The backoff library re-ran the request, and I got the ClientPayloadError. Is this expected? Shouldn't losing the server raise ConnectionError instead?
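For what it's worth, a hedged sketch of the kind of retry wrapper involved here, treating both exception types as retryable (it uses the backoff library mentioned above; the function and URL are illustrative, not the actual service code):

import aiohttp
import backoff

# Retry on both errors, since a dying server seems to surface sometimes as a
# connection error and sometimes as ClientPayloadError.
@backoff.on_exception(
    backoff.expo,
    (aiohttp.ClientPayloadError, aiohttp.ClientConnectionError),
    max_tries=5,
)
async def fetch(session: aiohttp.ClientSession, url: str) -> bytes:
    async with session.get(url) as response:
        return await response.read()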
I have the same issue with Python 3.10
Same error here!
I am also getting that error using iter_chunked when downloading files from S3
Every time I download the last few files, this error message appears
Hi, is there any fix for this? Appreciate it.
My services occasionally crash with this error, on Python 3.10.12 with aiohttp 3.8.5.
What are the obstacles to isolating this issue so we can fix it?
I'll make sure to hold on to the logs the next time it arises. Do we know what's causing it?
Thanks!
Not sure. Maybe we can figure out a message that causes the problem, or maybe it needs multiple steps to trigger... It looks like it might be related to #5238, so maybe we can get that PR updated and merged.
I'll try that PR out, will report if something breaks
@Dreamsorcerer @TheOnlyWayUp Did you manage to figure this out? It may be a session setting issue with respect to stale connections. I am running into this issue with langchainjs, which uses OpenAI's Python client.
All discussion is above, so feel free to follow up. PR #5238 could be worked on to get it merged, or if you can figure out a test (that runs in our CI) to reproduce the issue, then that would be a great help.
I started to have this issue with 3.8.6 on pypy 3.9.18 consistently on CI, 80% of the time on my local box for a very specific streaming unittest (so I can trigger it quite easily with a simple pytest run). It was working fine on 3.8.5.
The funny thing is that, reading your changelog, I saw the change for lenient parsing in llhttp and that -X dev mode does the parsing differently. Out of ideas, I tried running pytest with python -X dev -m pytest mytest, and suddenly the error was gone. If I remove -X dev again, the error comes back. Strangely, this works fine on every other Python version.
Hope this helps somehow.
Stacktrace is consistent with everyone in this thread
../../../venvpypy/lib/pypy3.9/site-packages/aiohttp/client_reqrep.py:1043: in read
self._body = await self.content.read()
../../../venvpypy/lib/pypy3.9/site-packages/aiohttp/streams.py:375: in read
block = await self.readany()
../../../venvpypy/lib/pypy3.9/site-packages/aiohttp/streams.py:397: in readany
await self._wait("readany")
../../../venvpypy/lib/pypy3.9/site-packages/aiohttp/streams.py:304: in _wait
await waiter
/usr/lib/pypy3.9/asyncio/futures.py:284: in __await__
yield self # This tells Task to wait for completion.
/usr/lib/pypy3.9/asyncio/tasks.py:328: in __wakeup
future.result()
I tried to run pytest with python -X dev -m pytest mytest, and suddenly the error is gone. If I remove -X dev again, the error is back.
Hmm, that's surprising. I wonder if strict parsing is causing the connection to get closed without raising an error somehow, thus forcing it to open a new connection...
I can confirm this on python 3.11/aiohttp 3.8.6
If I downgrade to 3.8.5 I don't have a problem if I do:
if method == "GET":
    await session.get(....)
if method == "POST":
    await session.post(...)
Strangely, if I just do session.request(method...), I get the error on 3.8.5 as well.
The possible fix (depending on what really is the root cause) is in 3.9.0b1, so please try to reproduce with that.
Nope, still failing.
Then someone will need to do some more debugging and/or figure out a reliable test that we can use to reproduce the behaviour.
Another data point: this is happening to our Octomachinery-based bots (Chronographer and Patchback), which are built on top of gidgethub+aiohttp, just like in Pablo's traceback.
It's happening under CPython 3.9.13 on the quay.io/fedora/python-39:latest image with aiohttp 3.6.2. The bots are running Sentry, so it's possible to look a bit deeper into the tracebacks: https://github.com/sanitizers/octomachinery/issues/59.
Upon inspecting them, I noticed that they are related to GET and PATCH requests to the GitHub APIs. Nothing really useful.
Judging from the other hints in this issue, this might be reproducible in slow environments, such as those with high network latency, or on PyPy (which is often slower close to startup, before the JIT warms up).
Back to gidgethub, it uses aiohttp.ClientSession.request() — https://github.com/gidgethub/gidgethub/blob/7474cae/gidgethub/aiohttp.py#L19, which is in line with one of the comments above suggesting that calling dedicated methods like .get() and .patch() instead of .request() might yield different behavior.
Anyway, having Sentry in place, we could probably make some educated guesses and instrument my bots to log more things. If anybody has ideas, I'm all ears...
Comparing .get(), .patch() and .request(), I concluded that they are virtually equivalent and haven't changed across versions. So your observation, @rlippmann, must be caused by something else. Also note that all of these methods return async context managers and should be used through async with and not await, for proper cleanup of the underlying response object.
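For reference, a small sketch of the recommended calling pattern (the URL is a placeholder):

import aiohttp

async def get_labels(session: aiohttp.ClientSession) -> bytes:
    # Using the method as an async context manager guarantees the response
    # (and its connection) is released even if reading the body fails.
    async with session.get("https://api.github.com/repos/testorg/test-debian/labels") as resp:
        return await resp.read()

# By contrast, resp = await session.get(...) leaves releasing the response
# up to the caller, which is easier to get wrong.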
So the exception is set here: https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/aiohttp/client_proto.py#L99.
I wonder if one of these tests is a reproducer (when the exception processing is unsuppressed):
- https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/tests/test_web_sendfile_functional.py#L951-L989
- https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/tests/test_client_functional.py#L1897-L1913
- https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/tests/test_web_server.py#L109-L130
So the exception is set here:
https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/aiohttp/client_proto.py#L99
That code suppresses the actual exception. I think we need to at least log the original exception, or ideally chain the original exception (like raise ... from ...) somehow.
Yes, I was also thinking that.
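A rough, self-contained sketch of the raise ... from pattern being suggested (PayloadError and read_body are stand-ins for illustration, not the actual aiohttp internals):

class PayloadError(Exception):
    """Stand-in for ClientPayloadError in this sketch."""

def read_body(raw: bytes, expected_length: int) -> bytes:
    if len(raw) < expected_length:
        underlying = ValueError("not enough data to satisfy Content-Length")
        # "raise ... from ..." keeps the original error visible in the traceback
        # instead of silently replacing it with the generic payload error.
        raise PayloadError("Response payload is not completed") from underlying
    return raw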
https://github.com/aio-libs/aiohttp/blob/213d1b22d42ce8efc54d7858b490e920fcdb4f0a/tests/test_web_sendfile_functional.py#L951-L989
When I injected a breakpoint here under the except block and modified client_proto.py to surface the causing exception details, I extracted the following by running that specific test with pytest -s:
(Pdb) pp traceback.print_exc()
Traceback (most recent call last):
File "~/src/github/aio-libs/aiohttp/aiohttp/client_proto.py", line 95, in connection_lost
uncompleted = self._parser.feed_eof()
^^^^^^^^^^^^^^^^^^^^^^^
File "aiohttp/_http_parser.pyx", line 510, in aiohttp._http_parser.HttpParser.feed_eof
raise ContentLengthError(
aiohttp.http_exceptions.ContentLengthError: 400, message:
Not enough data for satisfy content length header.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "~/src/github/aio-libs/aiohttp/tests/test_web_sendfile_functional.py", line 985, in test_static_file_huge_cancel
data += await resp.content.read(1024)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/src/github/aio-libs/aiohttp/aiohttp/streams.py", line 347, in read
raise self._exception
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <ContentLengthError: 400, message='Not enough data for satisfy content length header.'>
None