dask-kubernetes

Service account cannot patch resource `daskautoscalers/scale`

Open karlkovaciny opened this issue 11 months ago • 1 comments

Describe the issue: As far as I know, this happened without so much as updating a dependency. When creating a KubeCluster, I get a stack trace saying my service account is forbidden to patch resource daskautoscalers/scale.

Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/dask_kubernetes/operator/kubecluster/kubecluster.py", line 798, in _adapt
    await custom_objects_api.patch_namespaced_custom_object_scale(
...
kubernetes_asyncio.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: <CIMultiDictProxy('Audit-Id': '02d05f3c-e745-494f-aa17-c209fc5e86ea', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '1b0c10b1-bc45-43a2-ab7f-d55b26694784', 'X-Kubernetes-Pf-Prioritylevel-Uid': '413b4e74-5db3-4080-a597-b71d6bf53576', 'Date': 'Wed, 26 Jul 2023 05:55:20 GMT', 'Content-Length': '454')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" is forbidden: 
User \"system:serviceaccount:my-namespace:dask-sa\" cannot patch resource \"daskautoscalers/scale\" 
in API group \"kubernetes.dask.org\" in the namespace \"my-namespace\"",
"reason":"Forbidden","details":{"name":"my-cluster-spec-name","group":"kubernetes.dask.org","kind":"daskautoscalers"},"code":403}



During handling of the above exception, another exception occurred:
...
kubernetes_asyncio.client.exceptions.ApiException: (409)
Reason: Conflict
HTTP response headers: <CIMultiDictProxy('Audit-Id': '544d9095-e161-410c-8902-967163bd0eb5', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'Warning': '299 - "unknown field \\"metadata.dask.org/cluster-name\\""', 'Warning': '299 - "unknown field \\"metadata.dask.org/component\\""', 'X-Kubernetes-Pf-Flowschema-Uid': '1b0c10b1-bc45-43a2-ab7f-d55b26694784', 'X-Kubernetes-Pf-Prioritylevel-Uid': '413b4e74-5db3-4080-a597-b71d6bf53576', 'Date': 'Wed, 26 Jul 2023 05:55:20 GMT', 'Content-Length': '272')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" already exists",
"reason":"AlreadyExists","details":{"name":"my-cluster-spec-name","group":"kubernetes.dask.org","kind":"daskautoscalers"},"code":409}

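The 403 body in the traceback above already names the denied verb, resource, and API group, which are the fields an RBAC rule needs. A minimal sketch of pulling them out with the standard library follows. The body is the one from the traceback joined onto one line; the helper name `parse_forbidden` and the regular expression are illustrative assumptions, not part of dask-kubernetes or the Kubernetes client.

```python
import json
import re

# Body of the 403 response from the traceback, joined onto a single line.
STATUS_BODY = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"daskautoscalers.kubernetes.dask.org \\"my-cluster-spec-name\\" is forbidden: '
    'User \\"system:serviceaccount:my-namespace:dask-sa\\" cannot patch resource '
    '\\"daskautoscalers/scale\\" in API group \\"kubernetes.dask.org\\" '
    'in the namespace \\"my-namespace\\"",'
    '"reason":"Forbidden","details":{"name":"my-cluster-spec-name",'
    '"group":"kubernetes.dask.org","kind":"daskautoscalers"},"code":403}'
)

# Hypothetical pattern written against the message text shown in this issue.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def parse_forbidden(body):
    """Return the denied user/verb/resource/group/namespace, or None."""
    status = json.loads(body)
    if status.get("reason") != "Forbidden":
        return None
    match = FORBIDDEN_RE.search(status.get("message", ""))
    return match.groupdict() if match else None


denied = parse_forbidden(STATUS_BODY)
print(denied)
```

Here the denied verb is `patch` on `daskautoscalers/scale` in group `kubernetes.dask.org`, so that is the combination the service account's role would need to allow.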
Minimal Complete Verifiable Example: Sorry, I don't have an MCVE.

Anything else we need to know?: The first thing I tried was changing this line of my DaskClusterRole to add the disallowed subresource at the end:

resources: [daskclusters, daskworkergroups, daskworkergroups/scale, daskjobs, daskautoscalers, daskautoscalers/scale]

That just caused the error to switch to

{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"daskautoscalers.kubernetes.dask.org \"my-cluster-spec-name\" not found",
"reason":"NotFound","details":{"name":"init-report","group":"kubernetes.dask.org","kind":"daskautoscalers"},
"code":404}

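For context, Kubernetes RBAC treats a subresource such as `daskautoscalers/scale` as a separate entry from its parent resource, which is why the 403 names it explicitly and why adding it to the `resources` line changes the outcome. A simplified sketch of that matching in plain Python follows. The `verbs` list is an assumption (the role's verbs are not shown in this issue), and the real authorizer also handles forms such as `*/scale` and `resourceNames` that this sketch ignores.

```python
API_GROUP = "kubernetes.dask.org"

# Assumed verbs; the actual verbs in the reporter's role are not shown.
ASSUMED_VERBS = ["get", "list", "watch", "create", "patch", "delete"]

# The resources line before the edit described above.
rule_before = {
    "apiGroups": [API_GROUP],
    "resources": [
        "daskclusters", "daskworkergroups", "daskworkergroups/scale",
        "daskjobs", "daskautoscalers",
    ],
    "verbs": ASSUMED_VERBS,
}

# The same rule with the subresource appended, as in the edit above.
rule_after = dict(
    rule_before,
    resources=rule_before["resources"] + ["daskautoscalers/scale"],
)


def rule_allows(rule, verb, group, resource):
    """Simplified RBAC check: each field matches exactly or via "*"."""
    def matches(allowed, wanted):
        return "*" in allowed or wanted in allowed

    return (
        matches(rule["apiGroups"], group)
        and matches(rule["resources"], resource)
        and matches(rule["verbs"], verb)
    )


# Listing "daskautoscalers" alone does not cover its "scale" subresource.
print(rule_allows(rule_before, "patch", API_GROUP, "daskautoscalers/scale"))  # False
print(rule_allows(rule_after, "patch", API_GROUP, "daskautoscalers/scale"))   # True
```

This only models the permission check. The 404 that appeared after the role change is a different failure and is not explained by this sketch.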
Environment:

  • Dask version: Experienced with both an older and the current version:
dask = {extras = ["distributed"], version = "^2023.4.0"}
dask-kubernetes = "^2023.3.2"
---
dask = {extras = ["distributed"], version = "^2023.7.1"}
dask-kubernetes = "^2023.7.2"
  • Python version: 3.10.9
  • Operating System: Amazon Linux 2?
  • Install method (conda, pip, source): Poetry

karlkovaciny avatar Jul 26 '23 06:07 karlkovaciny

Thanks for reporting this. We will need an MCVE to get to the bottom of this. Perhaps you could share some details like:

  • How did you install the operator?
  • You mention you are using a service account when using KubeCluster. What permissions does it have? How was it created?
  • What code are you running? I guess it's something like `cluster = KubeCluster(...); cluster.scale(n)`.

jacobtomlinson avatar Jul 26 '23 13:07 jacobtomlinson

Closing due to inactivity

jacobtomlinson avatar Apr 30 '24 15:04 jacobtomlinson