Occasional errors `ClientPayloadError: Response payload is not completed` in client requests
Hi,
I'm seeing occasional `ClientPayloadError: Response payload is not completed` errors with some URLs. Requesting the same URL multiple times sometimes succeeds and sometimes fails.
Might be related to https://github.com/aio-libs/aiohttp/issues/2954
Please see the traceback below:
In [20]: r = await s.get('http://rayih.com/podcast/0001.xml')
In [21]: await r.read()
---------------------------------------------------------------------------
ClientPayloadError Traceback (most recent call last)
<ipython-input-21-39c3a552a171> in async-def-wrapper()
***/lib/python3.7/site-packages/aiohttp/client_reqrep.py in read(self)
971 if self._body is None:
972 try:
--> 973 self._body = await self.content.read()
974 for trace in self._traces:
975 await trace.send_response_chunk_received(self._body)
***/lib/python3.7/site-packages/aiohttp/streams.py in read(self, n)
356 blocks = []
357 while True:
--> 358 block = await self.readany()
359 if not block:
360 break
***/lib/python3.7/site-packages/aiohttp/streams.py in readany(self)
378 # without feeding any data
379 while not self._buffer and not self._eof:
--> 380 await self._wait('readany')
381
382 return self._read_nowait(-1)
***/lib/python3.7/site-packages/aiohttp/streams.py in _wait(self, func_name)
294 if self._timer:
295 with self._timer:
--> 296 await waiter
297 else:
298 await waiter
ClientPayloadError: Response payload is not completed
Steps to reproduce
Please run this script several times. In my case it fails about 1 time out of 10. curl 7.54.0 and Firefox 71.0b12 work reliably.
```python
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://rayih.com/podcast/0001.xml') as response:
            await response.read()

asyncio.run(main())
```
Your environment
Client: aiohttp 3.6.2, Python 3.7.5, OS X 10.12.6 (Sierra)
The issue still persists as of 3.8.1. It happens randomly when downloading large files. Is there possibly any workaround for this?
Any follow-ups on this one?
Also looking for a follow-up to this. Attempting to get large numbers of RSS feeds with asyncio results in this exception for me.
This part of the client is pretty much unusable because of this bug... No luck with the socket keepalive hack either.
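(The exact keepalive hack isn't spelled out here; a commonly attempted variant, assumed for illustration only, enables TCP keepalive on the response's underlying socket before reading the body:)

```python
import socket
import aiohttp

async def fetch_with_keepalive(url: str) -> bytes:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            # Assumed variant of the hack: enable TCP keepalive on the raw socket
            # behind this response, then read the payload as usual.
            sock = response.connection.transport.get_extra_info("socket")
            if sock is not None:
                sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
            return await response.read()

# Usage: asyncio.run(fetch_with_keepalive('http://rayih.com/podcast/0001.xml'))
```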
Hey everyone, I just released v3.9.4rc0. It'll hopefully help us get more complete tracebacks, and that, in turn, will let us know the underlying causes of this exception being raised. There may be several distinct behaviors underneath, and we need to discover what they are.
If you are able to, please try out this pre-release in environments where you can reproduce your problems and come back with logs containing complete tracebacks. Let's figure this out together!
@webknjaz - I installed rc0. For context, I'm using aiohttp indirectly through the third-party package aiosseclient (as aio-libs' own aiohttp-sse doesn't appear to have a Client), but given the traceback, I'm not sure that's particularly relevant; SSE can have multiple payloads separated by just line endings, and I'm wondering whether that's related here (along with the issue linked in the original comment).
Please let me know if there's anything I can do with rc0 to assist.
Traceback (most recent call last):
File "[...]/aiohttp/client_proto.py", line 94, in connection_lost
uncompleted = self._parser.feed_eof()
File "aiohttp/_http_parser.pyx", line 507, in aiohttp._http_parser.HttpParser.feed_eof
aiohttp.http_exceptions.TransferEncodingError: 400, message:
Not enough data for satisfy transfer length header.
The above exception was the direct cause of the following exception:
[snip asyncio setup]
File "sse.py", line 10, in main
async for event in aiosseclient('https://stream.wikimedia.org/v2/stream/page-undelete'):
File "[...]/aiosseclient.py", line 128, in aiosseclient
async for line in response.content:
File "[...]/aiohttp/streams.py", line 50, in __anext__
rv = await self.read_func()
File "[...]/aiohttp/streams.py", line 317, in readline
return await self.readuntil()
File "[...]/aiohttp/streams.py", line 351, in readuntil
await self._wait("readuntil")
File "[...]/aiohttp/streams.py", line 312, in _wait
await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
https://github.com/ebraminio/aiosseclient/blob/main/aiosseclient.py#L128
Wikimedia is a popular platform for testing SSE streams; to reproduce, you can use one that doesn't get a lot of traffic, like https://stream.wikimedia.org/v2/stream/page-undelete. It'll fail in about 60 seconds.
```python
import asyncio
import aiohttp
from aiosseclient import aiosseclient  # pip install aiosseclient

async def main():
    async for event in aiosseclient('https://stream.wikimedia.org/v2/stream/recentchange'):
        print(event)

if __name__ == '__main__':
    asyncio.run(main())
```
@DefiDebauchery it does look relevant at a glance, thanks! I saw several issues with that client error exception but no context, and I'm pretty sure there are several different underlying causes. It's useful to be able to collect them and see if some aren't handled properly on the aiohttp side. aiohttp should only raise errors for misbehaving servers and other legitimate problems, obviously.
Replicating these problems in isolated environments would be even better, as relying on third-party services is not exactly easy to debug, especially with an added layer of TLS in traffic captures.
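A fully local reproducer could look something like this sketch (untested, just to illustrate the idea): a plain asyncio server that advertises a chunked body and then drops the connection before sending the terminating zero-length chunk.

```python
import asyncio
import aiohttp

async def broken_server(reader, writer):
    # Consume the request headers, then send a chunked response that is cut off
    # before the terminating zero-length chunk.
    while await reader.readline() not in (b"\r\n", b""):
        pass
    writer.write(
        b"HTTP/1.1 200 OK\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"\r\n"
        b"5\r\nhello\r\n"  # one chunk, but no closing 0\r\n\r\n
    )
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(broken_server, "127.0.0.1", 8080)
    async with server:
        async with aiohttp.ClientSession() as session:
            async with session.get("http://127.0.0.1:8080/") as resp:
                await resp.read()  # expected to raise ClientPayloadError

asyncio.run(main())
```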
> Wikimedia is a popular platform for testing SSE streams; to reproduce, you can use one that doesn't get a lot of traffic, like https://stream.wikimedia.org/v2/stream/page-undelete. It'll fail in about 60 seconds.
Ran it for 5 minutes with no exceptions. You'll need to provide a better reproducer.
Never mind, the page-undelete stream fails, but the recentchange one in the code block works fine.
I don't think the error is wrong; the connection is getting dropped after 60 seconds of inactivity. It seems like an abrupt disconnect, so the content of the message is not completed.
So, for that SSE case, it looks to me like you either need to handle the error and retry, or there is something in aiosseclient that should be doing the retry (or maybe sending some kind of ping to keep the connection open).
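If handling it in the application is the route, a retry loop roughly like this sketch would do it (the back-off and stream URL here are just illustrative):

```python
import asyncio
import aiohttp
from aiosseclient import aiosseclient

async def consume(url: str) -> None:
    # Re-open the SSE stream whenever the server drops the connection mid-message
    # and aiohttp surfaces it as ClientPayloadError.
    while True:
        try:
            async for event in aiosseclient(url):
                print(event)
        except aiohttp.ClientPayloadError:
            await asyncio.sleep(1)  # back off briefly, then reconnect

asyncio.run(consume('https://stream.wikimedia.org/v2/stream/page-undelete'))
```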
The original reproducer at the top doesn't result in any exceptions when I test now (and the issue is from over 4 years ago).
So, I'd suggest we close this one as I don't think there's anything wrong here.
@Dreamsorcerer shouldn't it raise a more specific ServerDisconnectedError? In my opinion, ClientPayloadError implies a problem with the content received over a working connection, not a connection interruption.
Not to mention that streaming connections (including SSE) aren't intended to be closed after a short period of inactivity; that would be the job of the ClientTimeout configuration (or the server, but I don't think it's the server giving up).
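(For reference, relaxing the client-side timeouts for a long-lived stream would look roughly like this sketch; it wouldn't help against a server-side disconnect, though:)

```python
import asyncio
import aiohttp

async def main():
    # Sketch: disable the overall timeout so a long-lived SSE stream isn't cut
    # off by the client itself; the server can still drop the connection.
    timeout = aiohttp.ClientTimeout(total=None, sock_read=None)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.get('https://stream.wikimedia.org/v2/stream/page-undelete') as resp:
            async for line in resp.content:
                print(line)

asyncio.run(main())
```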
@DefiDebauchery have you tried running tcpdump there to observe what's happening?
In response to both of you, see the first line of the traceback:
File "[...]/aiohttp/client_proto.py", line 94, in connection_lost
As far as I can see, the disconnect is not triggered by us and no further data is sent from the server (that function is generally called from asyncio internals after the connection is dropped). So, when it abruptly disconnects, our code tries to feed_eof() to close out the message, which results in that error.
We could maybe figure out some improvement so it does produce a ServerDisconnectedError without a parsing error, but that won't make much difference in making that application work correctly (i.e. it'd just be catching a different exception).
> our code tries to feed_eof() to close out the message, which results in that error.

Maybe we can tweak feed_eof() in the connection_lost() case, so that it checks whether the connection is 'chunked' and, if so, feeds a closing chunk (`0\r\n\r\n`, I think) instead.
No, forget that. feed_eof() is literally just checking if the message is complete, and it's not. So, again, I think the error is correct.
It appears that the server accepts the connection, sends the headers to say it's a chunked message, waits 60 seconds, then closes the connection without sending a zero-length chunk to indicate that the message is complete.
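To illustrate the framing involved (plain HTTP/1.1 chunked encoding, nothing aiohttp-specific):

```python
# A chunked body is only complete once the zero-length chunk arrives:
complete = b"5\r\nhello\r\n0\r\n\r\n"   # terminated correctly
truncated = b"5\r\nhello\r\n"           # connection closed here -> "not enough data"
```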
ServerDisconnectedError happens after that. If the payload isn't incomplete, but the protocol hasn't received EOF (I'm not entirely clear what the difference is...), then it throws ServerDisconnectedError.
So this is the output I get when running my script above through strace:

`strace -f -e trace=network -s 10000 python3 sse.py`

Unfortunately, I'm not well-versed enough in such low-level connection data to make sense of this, but I'm adding it in case someone is.
[Setup packets]
[-- ~60s of nothing --]
[pid 184817] recvfrom(6, "\27\3\3\0\23\24\357\215}\272\270\303\3S\225\225j\4\33n}\375\244\6", 262144, 0, NULL, NULL) = 24
strace: Process 184823 attached
[pid 184818] +++ exited with 0 +++
[pid 184823] sendto(5, "\0", 1, 0, NULL, 0) = 1
[pid 184823] +++ exited with 0 +++
recvfrom(4, "\0", 4096, 0, NULL, NULL) = 1
recvfrom(4, 0x55b77e5e8270, 4096, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
Traceback (most recent call last):
File "[]/aiohttp/client_proto.py", line 94, in connection_lost
uncompleted = self._parser.feed_eof()
File "aiohttp/_http_parser.pyx", line 507, in aiohttp._http_parser.HttpParser.feed_eof
aiohttp.http_exceptions.TransferEncodingError: 400, message:
Not enough data for satisfy transfer length header.
We're seeing this on AWS after downloading a large number of objects from S3. It reproduces very infrequently and unreliably there. We've worked around it with some additional retry logic.
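Roughly, the shape of such retry logic is something like this sketch (the helper name, attempt count, and back-off are illustrative, not the exact code used):

```python
import asyncio
import aiohttp

async def read_with_retries(session: aiohttp.ClientSession, url: str, attempts: int = 3) -> bytes:
    # Re-issue the request when the payload arrives incomplete.
    for attempt in range(attempts):
        try:
            async with session.get(url) as resp:
                return await resp.read()
        except aiohttp.ClientPayloadError:
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # simple exponential back-off
```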
The SSE script posted above by @DefiDebauchery is a perfect reproducer. It always happens after exactly 15 minutes. It looks like it is the server that is closing the connection (15 minutes of events? no more soup for you!). I confirmed with some debug tracing that the crash is initiated by the socket being closed by the peer, going into _SelectorSocketTransport::connection_lost and the associated machinery and then ending up in one or the other of these two legs of this if-statement in HttpPayloadParser::feed_eof() (aiohttp/http_parser.py):
```python
elif self._type == ParseState.PARSE_LENGTH:
    raise ContentLengthError(
        "Not enough data for satisfy content length header."
    )
elif self._type == ParseState.PARSE_CHUNKED:
    raise TransferEncodingError(
        "Not enough data for satisfy transfer length header."
    )
```
and tcpdump confirms that the peer closes the connection:
10:19:08.938320 IP 103.102.166.224.https > 192.168.14.143.35994: Flags [P.], seq 39185783:39187076, ack 811, win 64240, length 1293
10:19:08.938784 IP 192.168.14.143.35994 > 103.102.166.224.https: Flags [.], ack 39187076, win 65535, length 0
10:19:08.998630 IP 103.102.166.224.https > 192.168.14.143.35994: Flags [FP.], seq 39187076:39187100, ack 811, win 64240, length 24
10:19:08.999275 IP 192.168.14.143.35994 > 103.102.166.224.https: Flags [P.], seq 811:835, ack 39187101, win 65535, length 24
10:19:08.999821 IP 192.168.14.143.35994 > 103.102.166.224.https: Flags [F.], seq 835, ack 39187101, win 65535, length 0
10:19:08.999939 IP 103.102.166.224.https > 192.168.14.143.35994: Flags [.], ack 835, win 64240, length 0
10:19:09.000269 IP 103.102.166.224.https > 192.168.14.143.35994: Flags [.], ack 836, win 64239, length 0
With a bit of tracing I determined that, upon entry to HttpPayloadParser::feed_eof(), we have:

```python
self._type == ParseState.PARSE_CHUNKED
isinstance(self.payload, aiohttp.streams.StreamReader)
str(self.payload) == '<StreamReader w=<Future pending cb=[Task.task_wakeup()]>>'
self.payload._size == 0
self.payload.total_bytes == 37914638  # (or whatever)
self.payload.buffer == deque([])
```
I think this safely confirms the theory that the server is closing the connection without a zero-sized chunk, so @Dreamsorcerer is correct on this one.