
Increase node removal speed

Open phillebaba opened this issue 2 years ago • 0 comments

Currently node-ttl relies on cluster autoscaler to remove the node from the cluster. More specifically, cluster autoscaler removes the node VM from the cloud provider, and the node controller then removes the node from the cluster. This happens only once cluster autoscaler considers the node unneeded. The parameter scale-down-unneeded-time specifies how long a node has to be unneeded before it is removed, and its default value is 10 minutes. As a result, removing each node takes 10 minutes plus the time needed to drain it. One option is to lower the value of scale-down-unneeded-time, but not all cloud providers allow this.
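To make the cost concrete, here is a rough sketch of the arithmetic described above. The function names and the 2 minute drain time are illustrative assumptions and not part of node-ttl or cluster autoscaler; the one-node-at-a-time rollout is also an assumed worst case, not documented behaviour.

```python
from datetime import timedelta

# Default for scale-down-unneeded-time, as stated in the issue text.
SCALE_DOWN_UNNEEDED_TIME = timedelta(minutes=10)


def estimated_removal_time(
    drain_time: timedelta,
    unneeded_time: timedelta = SCALE_DOWN_UNNEEDED_TIME,
) -> timedelta:
    """Approximate time to remove one node: drain time plus the unneeded wait."""
    return drain_time + unneeded_time


def sequential_rollout_time(
    node_count: int,
    drain_time: timedelta,
    unneeded_time: timedelta = SCALE_DOWN_UNNEEDED_TIME,
) -> timedelta:
    """Total time if nodes are removed strictly one after another (assumption)."""
    return node_count * estimated_removal_time(drain_time, unneeded_time)


if __name__ == "__main__":
    drain = timedelta(minutes=2)  # hypothetical drain time
    print(estimated_removal_time(drain))
    print(sequential_rollout_time(5, drain))
```

Under these assumptions the fixed 10 minute wait dominates the per-node cost, which is why shortening or bypassing it matters more than speeding up the drain.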

It would be good to look at options that could get cluster autoscaler to remove the node sooner than scale-down-unneeded-time allows, since we already know the node should be removed.

phillebaba, May 28 '22 16:05