
Keep nodes while deorbiter is running

Open databus23 opened this issue 3 years ago • 0 comments

Deleting volumes in the deorbiter fails for multiple minutes because the nodes are already gone and the CSI daemonset can no longer unmount the volumes. This leads to very long cluster deletion times in our soak test:

...
--- PASS: TestRunner/Cleanup/Cluster/IsDeleted (576.01s)
...

This change lets the launch controller wait for the deorbiter to finish its work before it starts terminating nodes.
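The ordering can be sketched as a simple gate. This is only an illustration under assumptions: the `Cluster` type and its `Deleting` and `DeorbitPending` fields are hypothetical and do not reflect how kubernikus actually tracks deorbiter progress.

```go
package main

// Cluster is a hypothetical, simplified view of a cluster that is being deleted.
type Cluster struct {
	Deleting       bool // deletion of the cluster has been requested
	DeorbitPending bool // the deorbiter still has cleanup work left (assumed flag)
}

// canTerminateNodes reports whether the launch controller may start
// terminating nodes: only once deletion was requested and the deorbiter is done.
func canTerminateNodes(c Cluster) bool {
	return c.Deleting && !c.DeorbitPending
}
```

A reconcile loop would call `canTerminateNodes` on each pass and simply requeue the cluster while it returns `false`.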

For the volumes to actually be deleted, we need to remove the pods that hold a reference to a PVC, now that the nodes stay around.
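A minimal sketch of selecting those pods, assuming a simplified `Pod` type (hypothetical, not the Kubernetes API type) whose `ClaimNames` field lists the claims referenced in the pod's volumes:

```go
package main

// Pod is a simplified, hypothetical stand-in for a Kubernetes pod.
type Pod struct {
	Namespace  string
	Name       string
	ClaimNames []string // PVC names referenced by the pod's volumes
}

// podsHoldingPVCs returns the pods that reference at least one PVC.
// These are the pods that would have to be deleted before the
// volumes can be unmounted and removed.
func podsHoldingPVCs(pods []Pod) []Pod {
	var holders []Pod
	for _, p := range pods {
		if len(p.ClaimNames) > 0 {
			holders = append(holders, p)
		}
	}
	return holders
}
```

Pods without any claim are left alone, so only workloads that actually block volume deletion are touched.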

With this change the deletion time is down considerably:

...
--- PASS: TestRunner/Cleanup/Cluster/IsDeleted (126.01s)
...

(I think deletion can be even faster; it's currently limited by the 120s wait for loadbalancer deletion.)

Open question:

  • [ ] What about deployments/replicasets/statefulsets creating new pods with PVC references? (Maybe cordoning all nodes is sufficient?)
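For reference, cordoning marks a node as unschedulable so the scheduler places no new pods on it. A sketch of the idea, assuming a simplified, hypothetical `Node` type; whether this is enough to stop controllers from recreating pods with PVC references is exactly the open question above:

```go
package main

// Node is a simplified, hypothetical stand-in for a Kubernetes node.
type Node struct {
	Name          string
	Unschedulable bool // mirrors the node's spec.unschedulable field
}

// cordonAll marks every node as unschedulable and returns how many
// nodes were changed. Nodes that are already cordoned are skipped.
func cordonAll(nodes []Node) int {
	changed := 0
	for i := range nodes {
		if !nodes[i].Unschedulable {
			nodes[i].Unschedulable = true
			changed++
		}
	}
	return changed
}
```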

databus23 · Nov 22 '22 12:11