kopf icon indicating copy to clipboard operation
kopf copied to clipboard

timeout errors are logged at exception level

Open pshchelo opened this issue 5 years ago • 0 comments

Expected Behavior

no traces in logs

Actual Behavior

from time to time we see the following in the controller logs:

[2020-01-02 11:39:42,792] kopf.reactor.queuein [ERROR   ] functools.partial(<function resource_handler at 0x7fcd14aec950>, lifecycle=<function asap at 0x7fcd14add9e0>, registry=<kopf.toolkits.legacy_registries.SmartGlobalRegistry object at 0x7fcd1423a0d0>, memories=<kopf.structs.containers.ResourceMemories object at 0x7fcd1488ec90>, resource=Resource(group='', version='v1', plural='secrets'), event_queue=<Queue at 0x7fcd14322d50 maxsize=0 _getters[1] tasks=217>) failed with an exception. Ignoring the event.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/kopf/reactor/queueing.py", line 179, in worker
    await handler(event=event, replenished=replenished)
  File "/usr/local/lib/python3.7/site-packages/kopf/reactor/handling.py", line 224, in resource_handler
    await patching.patch_obj(resource=resource, patch=patch, body=body)
  File "/usr/local/lib/python3.7/site-packages/kopf/clients/auth.py", line 46, in wrapper
    return await fn(*args, **kwargs, session=session)
  File "/usr/local/lib/python3.7/site-packages/kopf/clients/patching.py", line 54, in patch_obj
    raise_for_status=True,
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 504, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 860, in start
    self._continue = None
  File "/usr/local/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
    raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError

It seems it would be enough to handle asyncio.TimeoutError here https://github.com/zalando-incubator/kopf/blob/d74a0638da01037d65e8b6c7b041c4f9c2e257cd/kopf/reactor/queueing.py#L178-L182 similar to how it is handled separately several times in the code before that, and log only an error, not an exception (or may be even just a warning)

Steps to Reproduce the Problem

unfortunately can't really pin it down, may be just a performance issue (currently seen on our internal CI that has a lot crammed into several VMs with k8s).

Specifications

  • Platform: Ubuntu 18.04
  • Kubernetes version: 1.15
  • Python version: 3.7.3
  • Python packages installed: aiohttp==3.6.2,aiojobs==0.2.2,async-timeout==3.0.1,attrs==19.3.0,certifi==2019.11.28,chardet==3.0.4,Click==7.0,idna==2.8,iso8601==0.1.12,kopf==0.24,multidict==4.7.2,pykube-ng==19.12.0,PyYAML ==5.2,requests==2.22.0,typing-extensions==3.7.4.1,urllib3==1.25.7,yarl==1.4.2

pshchelo avatar Jan 02 '20 15:01 pshchelo