dask-kubernetes

support ownerReferences to clean up worker pods after notebook exits

Open yuvipanda opened this issue 5 years ago • 4 comments

Kubernetes offers 'garbage collection' of objects: when an owner object is deleted, Kubernetes can automatically delete the objects that list it in their ownerReferences.

In our case, when a notebook pod that spawned dask workers is deleted, the dask workers should get deleted too.

This is fairly straightforward to implement: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/

I can't put this in the worker template, since it doesn't seem possible to set metadata from there.

yuvipanda avatar Dec 03 '19 21:12 yuvipanda

Ah nice!

There is an assumption here that dask-kubernetes is being used from within a notebook pod. We should think about how to handle situations where that isn't true.

jacobtomlinson avatar Dec 04 '19 09:12 jacobtomlinson

@jacobtomlinson yeah, agreed. One easy way to test this would be to allow specifying fields under metadata in worker-template.yaml, not just under spec. That would let folks experiment with this and see where it can go.
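
To make the idea concrete, here is a rough sketch of what "accept metadata as well as spec" could look like when a user-supplied template is applied to a base pod definition. This is not how dask-kubernetes actually builds pods: `apply_worker_template`, `base_pod`, and `user_template` are hypothetical names, and the merge is a plain shallow update per section.

```python
import copy


def apply_worker_template(base_pod, template):
    """Return a copy of base_pod with the template's metadata and spec merged in.

    Hypothetical helper for illustration only: each top-level section is
    merged with a shallow dict update, so template keys win over base keys.
    """
    merged = copy.deepcopy(base_pod)
    for section in ("metadata", "spec"):
        overrides = copy.deepcopy(template.get(section, {}))
        merged.setdefault(section, {}).update(overrides)
    return merged


# Made-up base pod and user template, just to exercise the helper.
base_pod = {
    "metadata": {"labels": {"app": "dask"}},
    "spec": {"restartPolicy": "Never"},
}
user_template = {
    "metadata": {
        "ownerReferences": [
            {"apiVersion": "v1", "kind": "Pod", "name": "notebook", "uid": "abc"}
        ]
    },
}

worker_pod = apply_worker_template(base_pod, user_template)
```

With something like this, anything placed under metadata in the template (ownerReferences included) ends up on the worker pod, while fields the template doesn't mention are left alone.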

yuvipanda avatar Dec 04 '19 17:12 yuvipanda

Sounds good! Is this something you have the time to contribute?

jacobtomlinson avatar Dec 05 '19 09:12 jacobtomlinson

I think this is done now. I've been able to set the ownerReferences from the worker-template.

  dask-config.yaml: |
    kubernetes:
      port: 8786
      name: "dask-worker-{uuid}"
      worker-template:
        metadata:
          ownerReferences:
          - apiVersion: v1
            kind: Pod
            name: $POD_NAME
            controller: true
            uid: $POD_UID

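You also need to set the following environment variables, populated from the pod's own fields, on the pod that's running the scheduler (in local mode), so that $POD_NAME and $POD_UID have values to expand to: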

          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_UID
            valueFrom:
              fieldRef:
                fieldPath: metadata.uid
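
For anyone wondering how the `$POD_NAME` and `$POD_UID` placeholders turn into real values, here is a minimal sketch of the substitution step. It is only an illustration, not the actual dask config code: `expand_placeholders` is a hypothetical helper, and the environment values are made up.

```python
from string import Template


def expand_placeholders(value, env):
    """Recursively replace $VAR references in nested dicts, lists and strings.

    Hypothetical helper: unknown variables are left untouched
    (safe_substitute) and non-string values are returned unchanged.
    """
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    if isinstance(value, str):
        return Template(value).safe_substitute(env)
    return value


# The metadata section of the worker template from the config above.
worker_metadata = {
    "ownerReferences": [
        {
            "apiVersion": "v1",
            "kind": "Pod",
            "name": "$POD_NAME",
            "controller": True,
            "uid": "$POD_UID",
        }
    ]
}

# Stand-in for the scheduler pod's environment; these values are invented.
fake_env = {"POD_NAME": "jupyter-alice", "POD_UID": "1234-abcd"}

expanded = expand_placeholders(worker_metadata, fake_env)
```

After expansion, the owner reference names the scheduler pod itself, so deleting that pod lets Kubernetes garbage-collect the workers that point back at it.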

I'm using dask==2.23.0 and dask-kubernetes==0.10.1

davidsheldon avatar Aug 26 '20 14:08 davidsheldon

The classic KubeCluster was removed in https://github.com/dask/dask-kubernetes/pull/890. All users will need to migrate to the Dask Operator. Closing.

jacobtomlinson avatar Apr 30 '24 15:04 jacobtomlinson