
Getting error "Not enough data for satisfy transfer length header." when using Watch()

Open morkalfon opened this issue 10 months ago • 4 comments

Description of the problem

When using the watch.Watch().stream() function to listen for Kubernetes events, I encounter the following exception:

aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>

As a workaround, I've implemented a retry mechanism with backoff.

I didn't specify any timeout_seconds or _request_timeout settings (reference). I want the stream to run indefinitely.

Expected Behavior

  • The event stream should run continuously without encountering ClientPayloadError.
  • If the connection is interrupted, it should be handled gracefully without requiring retries.

Actual Behavior

  • The stream occasionally throws ClientPayloadError with a TransferEncodingError (400).
  • It seems to be caused by incomplete response payloads from the Kubernetes API.

Code

import asyncio
import logging
import random

from kubernetes_asyncio import client, watch

logger = logging.getLogger(__name__)
...

async def listener(namespace: str) -> None:
    watcher = None
    v1 = None

    while True:
        try:
            watcher = watch.Watch()
            v1 = client.CoreV1Api()

            function = v1.list_namespaced_config_map if namespace else v1.list_config_map_for_all_namespaces
            args = {"namespace": namespace} if namespace else {}

            # List once to obtain a starting resourceVersion for the watch.
            response = await function(**args)
            resource_version = response.metadata.resource_version

            async for event in watcher.stream(
                function,
                resource_version=resource_version,
                **args
            ):
                ...
        except Exception:
            logger.exception("Exception occurred while listening for events")
            delay = 3 + random.uniform(0, 1)
            logger.info(f"Retrying in {delay:.2f} seconds")
            await asyncio.sleep(delay)
        finally:
            # logger.exception in a finally block logs "NoneType: None" when
            # no exception is pending, so log at info level here instead.
            logger.info("Events stream interrupted. Closing connections")
            if watcher:
                watcher.stop()
            if v1:
                await v1.api_client.rest_client.pool_manager.close()
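A tighter variant of this workaround, sketched below, catches only aiohttp.ClientPayloadError and resumes from the last resourceVersion delivered by the stream instead of re-listing everything each time. The resume bookkeeping here is illustrative only; a stale resourceVersion would still come back as 410 Gone and require falling back to a fresh list.

import asyncio

from aiohttp.client_exceptions import ClientPayloadError
from kubernetes_asyncio import client, watch

async def watch_config_maps(namespace: str) -> None:
    v1 = client.CoreV1Api()

    # Prime the starting point with a single list call.
    response = await v1.list_namespaced_config_map(namespace)
    resource_version = response.metadata.resource_version

    while True:
        watcher = watch.Watch()
        try:
            async for event in watcher.stream(
                v1.list_namespaced_config_map,
                namespace=namespace,
                resource_version=resource_version,
            ):
                # Track the last version seen so a reconnect resumes here
                # instead of replaying the whole history.
                resource_version = event["object"].metadata.resource_version
                ...
        except ClientPayloadError:
            # The connection dropped mid-chunk; reconnect and resume.
            await asyncio.sleep(1)
        finally:
            watcher.stop()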

Environment

  • kubernetes_asyncio version: 32.0.0
  • Python version: 3.12
  • Kubernetes version: 1.31

Questions

  1. Is this issue related to how Kubernetes API servers handle chunked responses?
  2. Could this be mitigated by adjusting timeout_seconds, even though the goal is an indefinite stream? (A sketch of this is below.)
  3. Any recommendations for handling this error gracefully without frequent retries?

Would appreciate any insights on whether this is a known issue or if there's a recommended approach to prevent these exceptions.
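
On question 2, here is a minimal sketch of what that mitigation might look like, with a 300-second value chosen purely for illustration: passing timeout_seconds makes the API server end each watch cleanly, so the client reconnects on its own schedule instead of being cut off mid-chunk.

from kubernetes_asyncio import client, watch

async def listener_with_server_timeout() -> None:
    v1 = client.CoreV1Api()
    resource_version = None

    while True:
        watcher = watch.Watch()
        async for event in watcher.stream(
            v1.list_config_map_for_all_namespaces,
            timeout_seconds=300,  # illustrative; server closes the watch cleanly
            resource_version=resource_version,
        ):
            resource_version = event["object"].metadata.resource_version
            ...
        # Stream ended via the server-side timeout: loop and reconnect.
        watcher.stop()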

morkalfon avatar Mar 02 '25 07:03 morkalfon

Thanks for reporting this issue.

I don't know why you get these errors, so your current workaround with retries looks good to me.

Could you tell us what kind of cluster you have? On-prem, Google Cloud, AWS, etc.?

How frequently do you get this error? Like every 5 minutes? Does it also happen when there are no events to watch? Or does it work well when there are lots of events to watch?

tomplus avatar Mar 02 '25 20:03 tomplus

The cluster runs on Akamai Cloud (Linode LKE).

I see this error occurring frequently, but not at fixed intervals.

morkalfon avatar Mar 02 '25 21:03 morkalfon

I am also seeing this after upgrading from k8s 1.31 to 1.32, running on GKE. Seems to happen about every 5 minutes.

2025-05-16 14:18:58,911 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
2025-05-16 14:24:08,949 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
2025-05-16 14:29:19,034 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
2025-05-16 14:34:29,077 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
2025-05-16 14:39:39,111 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
2025-05-16 14:44:49,150 - k8s - ERROR - Uncaught Exception in list_service_for_all_namespaces watch: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>

nosammai avatar May 16 '25 14:05 nosammai

We're also running into this issue now, for jobs that run longer than 30 minutes, when we call read_namespaced_pod_log. We have yet to try setting timeout_seconds or active_deadline_seconds in the job_spec_properties. Ideally we'd avoid implementing custom retry logic.

Example traceback:

[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/aiohttp/client_proto.py", line 93, in connection_lost
[2025-06-03T02:07:46.556Z]     uncompleted = self._parser.feed_eof()
[2025-06-03T02:07:46.556Z]                   ^^^^^^^^^^^^^^^^^^^^^^^
[2025-06-03T02:07:46.556Z]   File "aiohttp/_http_parser.pyx", line 508, in aiohttp._http_parser.HttpParser.feed_eof
[2025-06-03T02:07:46.556Z] aiohttp.http_exceptions.TransferEncodingError: 400, message:
[2025-06-03T02:07:46.556Z]   Not enough data for satisfy transfer length header.
[2025-06-03T02:07:46.556Z] 
[2025-06-03T02:07:46.556Z] The above exception was the direct cause of the following exception:
[2025-06-03T02:07:46.556Z] 
[2025-06-03T02:07:46.556Z] Traceback (most recent call last):
[2025-06-03T02:07:46.556Z]   File "/app/shared/k8s_connection_async.py", line 2050, in _run_job_for_instance
[2025-06-03T02:07:46.556Z]     await self._wait_for_job_to_complete(
[2025-06-03T02:07:46.556Z]   File "/app/shared/k8s_connection_async.py", line 80, in wrapper
[2025-06-03T02:07:46.556Z]     result = await func(*args, **kwargs)
[2025-06-03T02:07:46.556Z]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-06-03T02:07:46.556Z]   File "/app/shared/k8s_connection_async.py", line 1984, in _wait_for_job_to_complete
[2025-06-03T02:07:46.556Z]     async for event in stream:
[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/kubernetes_asyncio/watch/watch.py", line 135, in __anext__
[2025-06-03T02:07:46.556Z]     return await self.next()
[2025-06-03T02:07:46.556Z]            ^^^^^^^^^^^^^^^^^
[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/kubernetes_asyncio/watch/watch.py", line 165, in next
[2025-06-03T02:07:46.556Z]     line = await self.resp.content.readline()
[2025-06-03T02:07:46.556Z]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 352, in readline
[2025-06-03T02:07:46.556Z]     return await self.readuntil()
[2025-06-03T02:07:46.556Z]            ^^^^^^^^^^^^^^^^^^^^^^
[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 386, in readuntil
[2025-06-03T02:07:46.556Z]     await self._wait("readuntil")
[2025-06-03T02:07:46.556Z]   File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 347, in _wait
[2025-06-03T02:07:46.556Z]     await waiter
[2025-06-03T02:07:46.556Z] aiohttp.client_exceptions.ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data for satisfy transfer length header.'>
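
In case it helps others, here is a rough sketch of a resume-on-drop wrapper around read_namespaced_pod_log. The 10-second since_seconds overlap is an arbitrary illustrative choice and may replay a few lines after a reconnect.

import asyncio
import time

from aiohttp.client_exceptions import ClientPayloadError
from kubernetes_asyncio import client

async def follow_pod_logs(name: str, namespace: str) -> None:
    v1 = client.CoreV1Api()
    last_drop = None  # monotonic timestamp of the last disconnect

    while True:
        since = None
        if last_drop is not None:
            # Resume from just before the drop; a small overlap is safer
            # than losing lines, at the cost of a few duplicates.
            since = int(time.monotonic() - last_drop) + 10

        resp = await v1.read_namespaced_pod_log(
            name, namespace, follow=True,
            since_seconds=since, _preload_content=False,
        )
        try:
            async for line in resp.content:
                print(line.decode(errors="replace").rstrip())
            return  # the container exited and the log ended normally
        except ClientPayloadError:
            last_drop = time.monotonic()
            await asyncio.sleep(1)
        finally:
            resp.release()  # return the connection to the pool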

kieran-4g avatar Jun 03 '25 15:06 kieran-4g