kubernetes_asyncio
ClientPayloadError occasionally thrown when iterating a watch
i do get a ClientPayloadError from time to time:
  File ... in run
    async for update in stream:
  File "/usr/local/lib/python3.8/dist-packages/kubernetes_asyncio/watch/watch.py", line 131, in __anext__
    return await self.next()
  File "/usr/local/lib/python3.8/dist-packages/kubernetes_asyncio/watch/watch.py", line 152, in next
    line = await self.resp.content.readline()
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/streams.py", line 338, in readline
    await self._wait("readline")
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/streams.py", line 306, in _wait
    await waiter
aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed
cheers
Is it a self-hosted cluster or one provided by Google/AWS/Azure? Could you check if kubectl works longer without the issue?
this is on a self-hosted microk8s cluster. are you suggesting i try running the same watch with kubectl? i didn't know kubectl could perform watches. could you point me to some docs if so?
with thanks
There is a flag --watch to watch for changes, for example:
`kubectl get pods --watch`
righto, i'll take a look. regardless, should the aiohttp.client_exceptions.ClientPayloadError be wrapped? (i'm not sure, i can see arguments both ways)
How long does it work before raising the exception?
I'm also not sure what to do. We could treat it as a "timeout" and silently reconnect, but on the other hand such behavior may hide errors in other cases.
> Could you check if kubectl works longer without the issue?
i haven't tested exhaustively, but it doesn't look like kubectl lasts longer.
> How long does it work before raising the exception?
it works for quite some time. i think this might typically result in about 4 pod restarts in 24 hours.
> I'm also not sure what to do. We could treat it as a "timeout" and silently reconnect, but on the other hand such behavior may hide errors in other cases.
yeah. ideally we could clearly tell apart the exceptions we expect from those we don't. something like:
    while True:
        try:
            async with watch.stream(self._api.list_namespaced_pod, self._namespace) as stream:
                async for update in stream:
                    ...  # handle the update
        except ExceptionsWeExpect:
            pass  # expected disconnect: loop around and reconnect
        except ExceptionsWeDontExpect:
            raise
with thanks
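The pattern proposed above can be sketched with the standard library alone. This is a rough illustration, not part of kubernetes_asyncio: `watch_with_reconnect`, `open_stream`, `handle`, `expected`, `retry_delay` and `max_reconnects` are all hypothetical names. The caller lists the exception types it considers expected; those are logged (so they are not hidden completely) and trigger a reconnect, while anything else propagates.

```python
import asyncio
import logging
from typing import AsyncIterator, Awaitable, Callable, Optional, Tuple, Type

logger = logging.getLogger(__name__)


async def watch_with_reconnect(
    open_stream: Callable[[], AsyncIterator],
    handle: Callable[[object], Awaitable[None]],
    expected: Tuple[Type[BaseException], ...],
    retry_delay: float = 1.0,
    max_reconnects: Optional[int] = None,
) -> int:
    """Consume a stream, reopening it after it ends or raises an expected error.

    Returns the number of reconnects once max_reconnects is reached. With
    max_reconnects=None it loops until an unexpected exception propagates
    or the task is cancelled.
    """
    reconnects = 0
    while True:
        try:
            async for update in open_stream():
                await handle(update)
        except expected as exc:
            # Expected disconnect: log it rather than hiding it, then reconnect.
            logger.warning("watch interrupted by %r; reconnecting", exc)
        # Exceptions not listed in `expected` propagate to the caller.
        if max_reconnects is not None and reconnects >= max_reconnects:
            return reconnects
        reconnects += 1
        await asyncio.sleep(retry_delay)
```

For the case in this thread, `expected` would presumably be `(aiohttp.ClientPayloadError,)`, and `open_stream` a small function that opens the watch on `list_namespaced_pod`. Whether that exception should always count as expected is exactly the open question discussed above.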
We are seeing this issue in an Azure Kubernetes 1.22 cluster. Whenever we watch anything and there is no activity for the 5-minute default Kubernetes timeout (even if prior events have been received), we see this error.
I am not familiar (enough) with the underpinnings of how Kubernetes does watch calls, but I could run a kubectl get pods --watch on the same cluster and it worked past the 5-minute timeout.