kopf
Kopf stops receiving namespace events
Expected Behavior
Kopf should actively receive all namespace events.
@kopf.on.event('', 'v1', 'namespaces')
async def handle_event(event, **kwargs):
    logger = kwargs["logger"]
    logger.debug(f"Event: {event}. Cause: {kwargs.get('cause')}.")
Actual Behavior
Kopf receives events for a while and then stops receiving them. The create, update, and delete handlers are no longer triggered, and the events do not show up in the raw event handler either.
Steps to Reproduce the Problem
- Set up kopf to listen to Namespace events (as shown above)
- Log a message when events occur
- Create and delete namespaces on a cron (once an hour or so). Notice that Kopf stops receiving events after a period of time.
Specifications
- Platform: Azure Kubernetes Service
- Kubernetes version: 1.13.10
- Python version: python:3.8.0-slim-buster
- Python packages installed: kopf requests requests_oauthlib parse
@logicfox Can you please add Kopf's version too? Use pip freeze | grep kopf or kopf --version.
Sure
kopf==0.22
Maybe a duplicate of #204 #142 (not certain though).
@logicfox Can you please try it with kopf>=0.23rc2? Specifically, kopf==0.23rc1 switches all the I/O internally to asyncio+aiohttp (#227). This already solved some issues with the synchronous sockets freezing in some cases, and may also solve the other issues with similar symptoms.
Please, be aware of the massive changes in this RC (see 0.23rc1 & optionally 0.23rc2 release notes) if you have a pre-existing operator, which can be affected — though, in theory, it should be fully backward compatible and safe, but who knows what can break in practice.
@nolar Sorry, I couldn't test this earlier. But it looks like the problem is still there in the master branch: watch seems to freeze after a while. I'm going to test this with the raw Kubernetes Python client to see if it's an issue with my cluster.
We experienced the same issue until we upgraded Kubernetes to 1.15.10 in AKS. In addition I changed the version of Kopf from 0.25 to 0.26.
Regarding the situation before the upgrade: I noticed that events for CRDs were still being received.
Not sure if this is related, but on [email protected] and [email protected] I tried:
@kopf.on.login()
def login_fn(**kwargs):
    # return kopf.login_via_client(**kwargs)
    return kopf.login_via_pykube(**kwargs)
@kopf.on.event('', 'v1', 'namespaces')
# @kopf.on.create('core', 'v1', 'namespaces')
Results in
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('{eksUrl}/api/v1/namespaces')
In the same prompt, kubectl get namespaces works.
upgraded kopf to 27rc5, got it working with:
@kopf.on.login()
def login_fn(**kwargs):
    return kopf.login_via_client(**kwargs)
@kopf.on.create('', 'v1', 'namespaces')
By default, timeoutSeconds for the watch session is not set, neither in Kopf (https://github.com/zalando-incubator/kopf/blob/master/kopf/structs/configuration.py#L68) nor in the Kubernetes API (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/). As a result, the session might get stuck forever. Setting watching.server_timeout to some value might help here. It is important to set server_timeout to a value less than watching.client_timeout (which is the aiohttp session's global timeout):
@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    settings.watching.server_timeout = 300
I think it is not only the watch session that might get stuck, since other calls have no default timeout configured either. I've proposed setting timeouts globally per aiohttp session in https://github.com/zalando-incubator/kopf/pull/377, but it looks like the settings cannot be overridden in the way proposed in the patch, so it has to be updated.
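To illustrate why the server-side timeout should be shorter than the client-side one, here is a toy model using only asyncio. It is not Kopf's or aiohttp's actual code: fake_server_stream and run_watch_session are hypothetical names, and the tiny timeouts are chosen only so the example runs quickly. With a healthy connection and server_timeout < client_timeout, the server closes the session first and the client can simply re-watch; if the connection goes silent, the client timeout is the only thing that ends the session.

```python
import asyncio
from typing import Optional


async def fake_server_stream(server_timeout: Optional[float], stalled: bool) -> None:
    """Hypothetical watch session: returns when the server closes the stream."""
    if stalled or server_timeout is None:
        # No server-side timeout (or a frozen connection): the stream never ends.
        await asyncio.Event().wait()
    # Otherwise the server closes the stream after server_timeout seconds.
    await asyncio.sleep(server_timeout)


async def run_watch_session(
    server_timeout: Optional[float],
    client_timeout: float,
    stalled: bool = False,
) -> str:
    """Report which side ended the (simulated) watch session."""
    try:
        await asyncio.wait_for(
            fake_server_stream(server_timeout, stalled),
            timeout=client_timeout,
        )
        return "closed-by-server"
    except asyncio.TimeoutError:
        return "aborted-by-client"


if __name__ == "__main__":
    # Healthy connection, server timeout shorter than client timeout.
    print(asyncio.run(run_watch_session(0.01, 0.5)))
    # Connection silently frozen: only the client timeout recovers.
    print(asyncio.run(run_watch_session(0.01, 0.2, stalled=True)))
```

Without either timeout, the simulated session would wait forever, which matches the symptom reported in this issue.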