Timeout errors are logged at exception level
Expected Behavior
No tracebacks in the logs for such timeouts.
Actual Behavior
From time to time, we see the following in the controller logs:
[2020-01-02 11:39:42,792] kopf.reactor.queueing [ERROR ] functools.partial(<function resource_handler at 0x7fcd14aec950>, lifecycle=<function asap at 0x7fcd14add9e0>, registry=<kopf.toolkits.legacy_registries.SmartGlobalRegistry object at 0x7fcd1423a0d0>, memories=<kopf.structs.containers.ResourceMemories object at 0x7fcd1488ec90>, resource=Resource(group='', version='v1', plural='secrets'), event_queue=<Queue at 0x7fcd14322d50 maxsize=0 _getters[1] tasks=217>) failed with an exception. Ignoring the event.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/kopf/reactor/queueing.py", line 179, in worker
    await handler(event=event, replenished=replenished)
  File "/usr/local/lib/python3.7/site-packages/kopf/reactor/handling.py", line 224, in resource_handler
    await patching.patch_obj(resource=resource, patch=patch, body=body)
  File "/usr/local/lib/python3.7/site-packages/kopf/clients/auth.py", line 46, in wrapper
    return await fn(*args, **kwargs, session=session)
  File "/usr/local/lib/python3.7/site-packages/kopf/clients/patching.py", line 54, in patch_obj
    raise_for_status=True,
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 504, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 860, in start
    self._continue = None
  File "/usr/local/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
    raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError
It seems it would be enough to handle asyncio.TimeoutError here:
https://github.com/zalando-incubator/kopf/blob/d74a0638da01037d65e8b6c7b041c4f9c2e257cd/kopf/reactor/queueing.py#L178-L182
similar to how it is already handled separately several times earlier in the code, and to log it only as an error without a traceback (or maybe even just as a warning). Note that on Python 3.7, asyncio.TimeoutError is an alias of concurrent.futures.TimeoutError, so catching asyncio.TimeoutError would cover the concurrent.futures._base.TimeoutError shown in the traceback above.
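For illustration, a minimal sketch of such handling, assuming the shape of the worker loop around the linked lines. The wrapper name and the exact log wording are made up for this example; only the handler(event=event, replenished=replenished) call comes from the traceback above:

```python
import asyncio
import logging

logger = logging.getLogger('kopf.reactor.queueing')

async def process_event_safely(handler, event, replenished):
    # Hypothetical wrapper around the call at queueing.py#L178-L182.
    try:
        await handler(event=event, replenished=replenished)
    except asyncio.TimeoutError:
        # Transient API timeouts are expected under load:
        # log a single line at ERROR (or WARNING), without a traceback.
        logger.error("Timed out while processing the event. Ignoring the event.")
    except Exception:
        # Anything else is genuinely unexpected and keeps the full traceback.
        logger.exception("%r failed with an exception. Ignoring the event.", handler)
```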
Steps to Reproduce the Problem
Unfortunately, I can't really pin it down; it may just be a performance issue (currently seen on our internal CI, which has a lot crammed into several VMs with k8s).
Specifications
- Platform: Ubuntu 18.04
- Kubernetes version: 1.15
- Python version: 3.7.3
- Python packages installed: aiohttp==3.6.2, aiojobs==0.2.2, async-timeout==3.0.1, attrs==19.3.0, certifi==2019.11.28, chardet==3.0.4, Click==7.0, idna==2.8, iso8601==0.1.12, kopf==0.24, multidict==4.7.2, pykube-ng==19.12.0, PyYAML==5.2, requests==2.22.0, typing-extensions==3.7.4.1, urllib3==1.25.7, yarl==1.4.2