kopf
Kopf crashes when there are disabled APIServers.
Long story short
I came across this issue while trying to run my operator inside a Kubernetes cluster that uses linkerd.io as its service mesh. Linkerd is not set up correctly there, so the team decided to disable the API via --runtime-config. Now /apis/tap.linkerd.io/v1alpha1/ returns 503 errors.
Normally I would like to ignore this error, as kubectl does: when I list pods, it shows a short warning that tap.linkerd.io is not available and then shows me the list of pods.
But I noticed that kopf keeps crashing.
I have tried settings.scanning.disabled = True, but that did not help, although from reading the docs I thought it would.
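To illustrate the kubectl-like behavior asked for above, here is a minimal sketch of a discovery scan that warns about an unavailable API group and carries on with the rest. It is not kopf's actual code: `GroupUnavailable`, `scan_tolerantly`, and `fake_fetch` are hypothetical names, and the fetch function is a stand-in for the real HTTP call with made-up data.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List


class GroupUnavailable(Exception):
    """Hypothetical error standing in for a 503 from a single API group."""


async def scan_tolerantly(
    group_versions: Iterable[str],
    fetch: Callable[[str], Awaitable[List[str]]],
) -> Dict[str, List[str]]:
    """Fetch the resources of every group/version, skipping unavailable ones."""
    names = list(group_versions)
    # return_exceptions=True keeps one failing group from aborting the others.
    results = await asyncio.gather(
        *(fetch(name) for name in names), return_exceptions=True
    )
    found: Dict[str, List[str]] = {}
    for name, result in zip(names, results):
        if isinstance(result, GroupUnavailable):
            print(f"warning: {name} is unavailable, skipping")
            continue
        if isinstance(result, BaseException):
            raise result  # any other failure still escalates
        found[name] = result
    return found


async def fake_fetch(name: str) -> List[str]:
    # Stand-in for a request to the API server; the data is invented.
    if name == "tap.linkerd.io/v1alpha1":
        raise GroupUnavailable(name)
    return ["pods"] if name == "v1" else ["deployments"]


result = asyncio.run(
    scan_tolerantly(["v1", "apps/v1", "tap.linkerd.io/v1alpha1"], fake_fetch)
)
```

The sketch only tolerates the one error type that represents an unavailable group, so genuine failures such as authentication problems would still surface.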
Kopf version
1.36.2
Kubernetes version
1.23.13
Python version
3.11
Code
No response
Logs
/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/running.py:179: FutureWarning: Absence of either namespaces or cluster-wide flag will become an error soon. For now, switching to the cluster-wide mode for backward compatibility.
warnings.warn("Absence of either namespaces or cluster-wide flag will become an error soon."
[2023-10-29 18:31:15,038] kopf.activities.star [INFO ] Activity 'startup_config' succeeded.
[2023-10-29 18:31:15,128] kopf._core.engines.a [INFO ] Initial authentication has been initiated.
[2023-10-29 18:31:15,130] kopf.activities.auth [INFO ] Activity 'login_fn' succeeded.
[2023-10-29 18:31:15,130] kopf._core.engines.a [INFO ] Initial authentication has finished.
[2023-10-29 18:31:16,932] kopf._core.reactor.o [ERROR ] Request attempt #1/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:19,133] kopf._core.reactor.o [ERROR ] Request attempt #2/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:20,236] kopf._core.reactor.o [ERROR ] Request attempt #3/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:22,335] kopf._core.reactor.o [ERROR ] Request attempt #4/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:25,347] kopf._core.reactor.o [ERROR ] Request attempt #5/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:30,437] kopf._core.reactor.o [ERROR ] Request attempt #6/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:38,454] kopf._core.reactor.o [ERROR ] Request attempt #7/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:31:51,477] kopf._core.reactor.o [ERROR ] Request attempt #8/9 failed; will retry: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:32:12,502] kopf._core.reactor.o [ERROR ] Request attempt #9/9 failed; escalating: GET https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1 -> APIServerError(None, None)
[2023-10-29 18:32:12,538] kopf._core.reactor.r [ERROR ] Resource observer has failed: (None, None)
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
response.raise_for_status()
File "/usr/local/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1005, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 503, message='Service Unavailable', url=URL('https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 108, in guard
await coro
File "/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/observation.py", line 113, in resource_observer
resources = await scanning.scan_resources(groups=group_filter, settings=settings, logger=logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 31, in scan_resources
resources.update(await coro)
^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/tasks.py", line 605, in _wait_for_one
return f.result() # May raise f.exception().
^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 83, in _read_new_apis
resources.update(await coro)
^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/tasks.py", line 605, in _wait_for_one
return f.result() # May raise f.exception().
^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 97, in _read_version
rsp = await api.get(url, settings=settings, logger=logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 111, in get
response = await request(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIServerError: (None, None)
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
response.raise_for_status()
File "/usr/local/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1005, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 503, message='Service Unavailable', url=URL('https://kubernetes.default.svc/apis/tap.linkerd.io/v1alpha1')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/kopf", line 8, in <module>
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/cli.py", line 60, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/decorators.py", line 92, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/cli.py", line 109, in run
return running.run(
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/running.py", line 81, in run
asyncio.run(coro)
File "/usr/local/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/running.py", line 138, in operator
await run_tasks(operator_tasks, ignored=existing_tasks)
File "/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/running.py", line 419, in run_tasks
await aiotasks.reraise(root_done | root_cancelled | hung_done | hung_cancelled)
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 238, in reraise
task.result() # can raise the regular (non-cancellation) exceptions.
^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 108, in guard
await coro
File "/usr/local/lib/python3.11/site-packages/kopf/_core/reactor/observation.py", line 113, in resource_observer
resources = await scanning.scan_resources(groups=group_filter, settings=settings, logger=logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 31, in scan_resources
resources.update(await coro)
^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/tasks.py", line 605, in _wait_for_one
return f.result() # May raise f.exception().
^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 83, in _read_new_apis
resources.update(await coro)
^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/tasks.py", line 605, in _wait_for_one
return f.result() # May raise f.exception().
^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/scanning.py", line 97, in _read_version
rsp = await api.get(url, settings=settings, logger=logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 111, in get
response = await request(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIServerError: (None, None)
Additional information
No response
@mehrdad-khojastefar I'm also facing the exact same issue in our k8s cluster
@mehrdad-khojastefar were you able to find any workaround for this issue?
@prabhatkgupta I was able to fix it; you can take a look here: https://github.com/mehrdad-khojastefar/kopf. As you can tell, I haven't had time to make it a proper pull request :). I am using this version in production and it hasn't had problems since. Please review it and use it with caution; I don't suggest using it everywhere without testing. I will make it a proper pull request in the upcoming weeks.
@mehrdad-khojastefar how can I use your code in my Docker image?
@prabhatkgupta https://gist.github.com/javrasya/e95ade856ff42e4649972f8a54368459 This should help. You need to modify the requirements.txt file and rebuild your Docker image.
@mehrdad-khojastefar I tried to pip install from your GitHub repo and am facing the following issue:
Traceback (most recent call last):
File "/usr/local/bin/kopf", line 5, in <module>
from kopf.cli import main
File "/usr/local/lib/python3.9/site-packages/kopf/__init__.py", line 117, in <module>
from kopf._core.engines.admission import (
File "/usr/local/lib/python3.9/site-packages/kopf/_core/engines/admission.py", line 14, in <module>
from kopf._cogs.clients import creating, errors, patching
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/creating.py", line 3, in <module>
from kopf._cogs.clients import api
File "/usr/local/lib/python3.9/site-packages/kopf/_cogs/clients/api.py", line 55, in <module>
) -> aiohttp.ClientResponse | None:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
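For context on the TypeError above: the paths in this traceback point at Python 3.9, whereas the original report used 3.11. Writing a union as `X | None` in an annotation that is evaluated at definition time only works on Python 3.10 and newer (PEP 604). A minimal sketch of a spelling that works on both versions follows; `first_word` is a hypothetical function used only for illustration and is not part of kopf.

```python
from typing import Optional


def first_word(text: str) -> Optional[str]:
    """Return the first whitespace-separated word, or None for blank input."""
    words = text.split()
    return words[0] if words else None
```

Running the image on Python 3.10 or newer should also avoid this particular error without code changes, assuming nothing else in the environment requires 3.9.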