Handle `image`, `env` and `args` field updates in DaskCluster in the k8s operator
I have a permanent Dask cluster in Kubernetes. The current operator ignores all changes to the manifest. There was already an issue about supporting spec updates, but it was closed as resolved once support for the scale field was implemented: https://github.com/dask/dask-kubernetes/issues/636.
The only fields that cause changes to the deployment after applying an updated manifest are spec.worker.replicas and the DaskAutoscaler minimum/maximum.
Is it possible to support other fields, specifically image, args, env, and volumes/mounts?
If not, what would be the optimal way to gracefully shut down and update the cluster?
Cluster manifest (mostly copy-pasted from the example):
---
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: dask-primary
spec:
  worker:
    replicas: 1
    spec:
      containers:
        - name: worker
          image: "//backend/dask:image"
          imagePullPolicy: Always
          args:
            - worker
            - --name
            - $(DASK_WORKER_NAME)
            - --dashboard
            - --dashboard-address
            - "8788"
          ports:
            - name: http-dashboard
              containerPort: 8788
              protocol: TCP
          env:
            - name: ENV_1
              value: "1"
            - name: ENV_2
              value: "2"
          volumeMounts:
            - name: kafka-certs
              mountPath: /etc/ssl/kafka/ca.crt
              subPath: ca.crt
              readOnly: true
      volumes:
        - name: kafka-certs
          configMap:
            name: kafka-certs
  scheduler:
    spec:
      containers:
        - name: scheduler
          image: "//backend/dask:image"
          imagePullPolicy: Always
          args:
            - scheduler
          ports:
            - name: tcp-comm
              containerPort: 8786
              protocol: TCP
            - name: http-dashboard
              containerPort: 8787
              protocol: TCP
          readinessProbe:
            httpGet:
              port: http-dashboard
              path: /health
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              port: http-dashboard
              path: /health
            initialDelaySeconds: 15
            periodSeconds: 20
      imagePullSecrets:
        - name: regcred
    service:
      type: ClusterIP
      selector:
        dask.org/cluster-name: dask-primary
        dask.org/component: scheduler
      ports:
        - name: tcp-comm
          protocol: TCP
          port: 8786
          targetPort: "tcp-comm"
        - name: http-dashboard
          protocol: TCP
          port: 8787
          targetPort: "http-dashboard"
---
apiVersion: kubernetes.dask.org/v1
kind: DaskAutoscaler
metadata:
  name: dask-primary
spec:
  cluster: dask-primary
  minimum: 1
  maximum: 10
Operator version: 2024.5.0, installed with `helm install --repo https://helm.dask.org --create-namespace -n dask-operator --generate-name --version 2024.5.0 dask-kubernetes-operator`
Dask version: custom-built image that uses the following dependencies:
dask = "^2024.5.2"
bokeh = "^3.4.1"
distributed = "^2024.5.2"
It behaves the same with the 2024.5.2-py3.11 image, though.
Permanent clusters are far less common than ephemeral clusters, so I'm not surprised this hasn't come up before.
I would be happy to propagate other changes if that's valuable to you. Do you have any interest in raising a PR to handle this?
Alternatively you could just delete and recreate the resource?
Dropping daskcluster.kubernetes.dask.org/dask-primary is what I do now. My concern with that is graceful shutdown, because the scheduler and workers might have pending tasks. Is there a way to handle this?
When you delete the cluster, all of its Pods are sent a SIGTERM. At that point the Dask scheduler and workers should shut down gracefully. If they take too long to shut down, Kubernetes follows up with a SIGKILL, but that timeout is configurable via terminationGracePeriodSeconds.
In a long-lived deployment I expect you have some application that submits work to the Dask cluster. If the Dask cluster restarts without completing a computation, it should be the job of that application to resubmit the work.
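For reference, here is a client-side sketch (not an operator feature) of draining the cluster before deleting the DaskCluster resource: wait for in-flight tasks to finish, then ask the scheduler to retire its workers. The scheduler address below is an assumption; adjust it to match your Service name and namespace.

```python
# Client-side sketch only: drain in-flight work, then retire the workers before
# the DaskCluster resource is deleted. The scheduler address is assumed to be
# the cluster's scheduler Service, reachable from wherever this runs.
from time import sleep

from distributed import Client

client = Client("tcp://dask-primary-scheduler:8786")

# client.processing() maps each worker address to the task keys it is currently
# running; wait until every worker reports an empty list.
while any(tasks for tasks in client.processing().values()):
    sleep(5)

# Gracefully retire all workers (the scheduler moves in-memory results off them
# first), then drop the connection. Deleting the DaskCluster afterwards only
# tears down idle pods.
client.retire_workers(list(client.scheduler_info()["workers"]))
client.close()
```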
I would like to mention that I also encountered this issue. I currently have a company CI that updates Kubernetes resources via GitHub Actions, and manually deleting and recreating the cluster is not easy due to permissions.
I've also tried deleting the deployments by hand, but the operator doesn't seem to notice that the deployments are gone and just continues as normal (with some exceptions when trying to apply the autoscaling).
Edit: From taking a look at the code, the operator only listens for events on the cluster/worker group itself, not for changes to any of the container properties. I wonder if it's possible to listen to any change to the DaskCluster, find the diffs, and patch them, instead of listening to individual events.
I wonder if it's possible to listen to any change to the DaskCluster, find the diffs, and patch them, instead of listening to individual events.
There's no reason why not; it just hasn't been implemented. If you are interested in making a PR then feel free to ping me on it for review.
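For anyone picking this up, here is a rough sketch of the shape such a handler could take, using kopf (which the operator is built on). It is not the operator's actual code: the use of the official kubernetes client is illustrative, and it assumes the worker Deployments carry dask.org/cluster-name and dask.org/component=worker labels, matching the selector style in the Service spec above.

```python
# Illustrative sketch, not the operator's real implementation: react to changes
# in a DaskCluster's worker pod spec and patch the existing worker Deployments
# so that image/env/args updates roll out.
import kopf
import kubernetes


@kopf.on.startup()
def configure(**kwargs):
    # Load credentials for the official client (in-cluster service account).
    kubernetes.config.load_incluster_config()


@kopf.on.field("kubernetes.dask.org", "v1", "daskclusters", field="spec.worker.spec")
def propagate_worker_spec(old, new, name, namespace, logger, **kwargs):
    if old is None:
        # Initial creation is handled by the existing create handlers.
        return

    apps = kubernetes.client.AppsV1Api()
    # Assumed labelling of the operator's child resources.
    selector = f"dask.org/cluster-name={name},dask.org/component=worker"
    deployments = apps.list_namespaced_deployment(namespace, label_selector=selector)
    for deployment in deployments.items:
        # Replace the pod template spec with the updated one from the DaskCluster;
        # the Deployment controller then rolls the worker pods.
        apps.patch_namespaced_deployment(
            deployment.metadata.name,
            namespace,
            {"spec": {"template": {"spec": new}}},
        )
        logger.info("Patched worker deployment %s", deployment.metadata.name)
```

A similar handler on spec.scheduler.spec could cover the scheduler Deployment.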
I'll see if I can find a time slot to work that out