Kopf Crashes with "Connection broken: IncompleteRead(0 bytes read)"
Expected Behavior
A simple handler should not crash.
Actual Behavior
Kopf crashes after a couple of minutes
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 639, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 397, in _error_catcher
    yield
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 704, in read_chunked
    self._update_chunk_length()
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 643, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/requests/models.py", line 750, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 527, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 732, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.7/site-packages/urllib3/response.py", line 415, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/kopf", line 10, in
```
Steps to Reproduce the Problem
Start Kopf with the following handler:
```python
import kopf
import yaml
import os


@kopf.on.event('', 'v1', 'pods', labels={"type": "mongod"})
def pod_changed(logger, body, **kwargs):
    logger.info(f"Pod: %s", body['metadata']['name'])
    pass
```

Kopf crashes after around 5 minutes.
Specifications
- Platform: Docker container: python:3.7.3-alpine3.9
- Kubernetes version: (use `kubectl version`)
  Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
  Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
- Python version: (use `python --version`)
  Python 3.7.3
- Python packages installed: (use `pip freeze --all`)
    aiohttp==3.5.4
    aiojobs==0.2.2
    async-timeout==3.0.1
    attrs==19.1.0
    cachetools==3.1.1
    certifi==2019.6.16
    chardet==3.0.4
    Click==7.0
    google-auth==1.6.3
    idna==2.8
    iso8601==0.1.12
    kopf==0.20
    kubernetes==10.0.0
    multidict==4.5.2
    oauthlib==3.0.2
    pip==19.1.1
    pyasn1==0.4.6
    pyasn1-modules==0.2.6
    pykube-ng==0.28
    python-dateutil==2.8.0
    PyYAML==5.1.2
    requests==2.22.0
    requests-oauthlib==1.2.0
    rsa==4.0
    setuptools==41.0.1
    six==1.12.0
    urllib3==1.25.3
    websocket-client==0.56.0
    wheel==0.33.3
    yarl==1.3.0
@chilicat Thanks for reporting.
Could you please run an experiment in your environment: if you put this line at the top of your script, does the delayed error happen exactly after the specified time (in seconds)? If you set it to 600 seconds (10 mins), does it still happen at ~5 mins?
```python
import kopf

kopf.config.WatchersConfig.default_stream_timeout = 60

@kopf.on.event(...)
...
```
The process does not crash anymore (~50 minutes, still running)
@chilicat Thanks. So, let it be a workaround for now (even though `kopf.config...` is undocumented and internal). Please wrap it in a try-except, in case this module/class/attribute is renamed or removed in the future.
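A minimal sketch of that try-except wrapping, for illustration only. The attribute path `kopf.config.WatchersConfig.default_stream_timeout` is the one quoted in this thread. The helper name `set_stream_timeout` and its `module_name` parameter are hypothetical and are not part of Kopf. The `hasattr` check is an extra assumption: without it, assigning to a renamed attribute would silently succeed and have no effect.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def set_stream_timeout(seconds, module_name="kopf.config"):
    """Try to apply the undocumented timeout workaround; never raise.

    Returns True if the setting was applied, False if the internal
    module, class, or attribute could not be found.
    """
    try:
        # The module is looked up by name so that a missing or renamed
        # module only produces a warning instead of crashing the operator.
        config = importlib.import_module(module_name)
        watchers = config.WatchersConfig
        if not hasattr(watchers, "default_stream_timeout"):
            raise AttributeError("default_stream_timeout")
        watchers.default_stream_timeout = seconds
    except (ImportError, AttributeError) as exc:
        logger.warning("Stream timeout workaround not applied: %r", exc)
        return False
    return True


# Call this at the top of the operator script, before any handlers are declared.
set_stream_timeout(60)
```

If the internals change in a later release, the operator still starts and only logs a warning, which is a hint that the workaround should be revisited or dropped.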
I have seen this issue a few times, with sporadic server-side disconnections when the `?timeout=...` query arg is not specified. It goes deep into the K8s API implementation and Python's internals: kopf→pykube→requests→urllib3→http→socket.
I would prefer not to fix the sync i/o issues in this async app anymore (too many, too hard), and would rather replace all of this with aiohttp as the core of Kopf's i/o (coming soon), and then fix the connection issues there (if they happen).
So, let's keep this issue open until then, so that it is not forgotten and a fix is added.
@nolar Sure, no problem. Thanks for the fast feedback.
I am experiencing the same issue with my PoC of an operator for a Vertica cluster DB. Luckily, the workaround works for me as well. Subscribing.
@nolar While playing with my operator, the error started to occur more and more often, to the point that it is almost impossible to continue developing it. Unfortunately, the workaround has stopped working. Is delivery of the aiohttp-based replacement already planned? Is there any other way to work around the issue?
`kopf==0.23rc1` is now pre-released (see the release notes). It is now fully aiohttp-based and contains no synchronous API calls, which means the whole I/O machinery has changed. The described issue is therefore either completely gone or will look different.
@chilicat @jaceksan Please give this release candidate a try: is the reported issue gone (with the workaround temporarily removed)? I could not reproduce it in any of my environments.