dask-kubernetes
Service account cannot patch resource `daskautoscalers/scale`
Describe the issue:
As far as I know, this happened without so much as updating a dependency. When creating a KubeCluster, I get a stack trace saying my service account is forbidden to patch resource `daskautoscalers/scale`.
Encountered exception during execution:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/dask_kubernetes/operator/kubecluster/kubecluster.py", line 798, in _adapt
await custom_objects_api.patch_namespaced_custom_object_scale(
...
kubernetes_asyncio.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: <CIMultiDictProxy('Audit-Id': '02d05f3c-e745-494f-aa17-c209fc5e86ea', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '1b0c10b1-bc45-43a2-ab7f-d55b26694784', 'X-Kubernetes-Pf-Prioritylevel-Uid': '413b4e74-5db3-4080-a597-b71d6bf53576', 'Date': 'Wed, 26 Jul 2023 05:55:20 GMT', 'Content-Length': '454')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" is forbidden:
User \"system:serviceaccount:my-namespace:dask-sa\" cannot patch resource \"daskautoscalers/scale\"
in API group \"kubernetes.dask.org\" in the namespace \"my-namespace\"",
"reason":"Forbidden","details":{"name":"my-cluster-spec-name","group":"kubernetes.dask.org","kind":"daskautoscalers"},"code":403}
During handling of the above exception, another exception occurred:
...
kubernetes_asyncio.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: <CIMultiDictProxy('Audit-Id': '544d9095-e161-410c-8902-967163bd0eb5', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Warning': '299 - "unknown field \\"metadata.dask.org/cluster-name\\""', 'Warning': '299 - "unknown field \\"metadata.dask.org/component\\""', 'X-Kubernetes-Pf-Flowschema-Uid': '1b0c10b1-bc45-43a2-ab7f-d55b26694784', 'X-Kubernetes-Pf-Prioritylevel-Uid': '413b4e74-5db3-4080-a597-b71d6bf53576', 'Date': 'Wed, 26 Jul 2023 05:55:20 GMT', 'Content-Length': '272')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" already exists",
"reason":"AlreadyExists","details":{"name":"my-cluster-spec-name","group":"kubernetes.dask.org","kind":"daskautoscalers"},"code":409}
Minimal Complete Verifiable Example: Sorry, I don't have an MCVE.
Anything else we need to know?: The first thing I tried was changing this line of my DaskClusterRole to add the disallowed subresource at the end:
resources: [daskclusters, daskworkergroups, daskworkergroups/scale, daskjobs, daskautoscalers, daskautoscalers/scale]
That just caused the error to switch to
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" not found",
"reason":"NotFound","details":{"name":"init-report","group":"kubernetes.dask.org","kind":"daskautoscalers"},
"code":404}
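For readers following along, here is a minimal sketch of why the original 403 appears when the role lists `daskautoscalers` but not `daskautoscalers/scale`: Kubernetes RBAC treats a subresource as a separate resource string, so access to the parent does not imply access to the subresource. The `rule_allows` helper is hypothetical and only approximates the API server's matching (exact name, `*`, and `*/subresource`); the verbs in the example rule are an assumption, since the issue only shows the `resources` line. It does not explain the later 404.

```python
def rule_allows(rule, api_group, resource, verb, subresource=""):
    """Return True if a single RBAC-style rule permits the request.

    Simplified approximation of Kubernetes rule matching; not the real
    authorizer. Resource names match exactly, via "*", or via "*/<sub>".
    """
    requested = f"{resource}/{subresource}" if subresource else resource
    group_ok = any(g in ("*", api_group) for g in rule.get("apiGroups", []))
    verb_ok = any(v in ("*", verb) for v in rule.get("verbs", []))
    resource_ok = any(
        r in ("*", requested) or (subresource != "" and r == f"*/{subresource}")
        for r in rule.get("resources", [])
    )
    return group_ok and verb_ok and resource_ok


GROUP = "kubernetes.dask.org"

# Rule before the change. The verbs are hypothetical; only the resources
# list is taken from the issue.
before = {
    "apiGroups": [GROUP],
    "resources": [
        "daskclusters",
        "daskworkergroups",
        "daskworkergroups/scale",
        "daskjobs",
        "daskautoscalers",
    ],
    "verbs": ["get", "list", "watch", "create", "patch", "delete"],
}

# Rule after appending the subresource, as described in the issue.
after = dict(before, resources=before["resources"] + ["daskautoscalers/scale"])

print(rule_allows(before, GROUP, "daskautoscalers", "patch", "scale"))  # False
print(rule_allows(after, GROUP, "daskautoscalers", "patch", "scale"))   # True
```

The same pattern is why `daskworkergroups/scale` already appears as its own entry next to `daskworkergroups` in the role.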
Environment:
- Dask version: Experienced with both an older and the current version:
dask = {extras = ["distributed"], version = "^2023.4.0"}
dask-kubernetes = "^2023.3.2"
---
dask = {extras = ["distributed"], version = "^2023.7.1"}
dask-kubernetes = "^2023.7.2"
- Python version: 3.10.9
- Operating System: Amazon Linux 2?
- Install method (conda, pip, source): Poetry
Thanks for reporting this. We will need an MCVE to get to the bottom of this. Perhaps you could share some details like:
- How did you install the operator?
- You mention you are using a service account when using KubeCluster. What permissions does it have? How was it created?
- What code are you running? I guess it's something like cluster = KubeCluster(...); cluster.scale(n).
Closing due to inactivity