node-ttl
Increase node removal speed
Currently, node-ttl relies on cluster autoscaler to remove a node from the cluster. More specifically, cluster autoscaler removes the node VM from the cloud provider, and the node controller then removes the node from the cluster. This happens only once cluster autoscaler considers the node unneeded. The `scale-down-unneeded-time` parameter specifies how long a node must be unneeded before it is removed; the default is 10 minutes. With the default, removing each node takes 10 minutes plus the time needed to drain it. One option is to lower `scale-down-unneeded-time`, but not all cloud providers allow this.
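To make the cost concrete, here is a rough sketch of the arithmetic described above. The function names and the 2-minute drain time are hypothetical and chosen only for illustration. The second function also assumes nodes are removed strictly one after another, which may not match how node-ttl or cluster autoscaler behave in a given cluster.

```python
from datetime import timedelta

# Default scale-down-unneeded-time, as stated in this issue.
DEFAULT_SCALE_DOWN_UNNEEDED_TIME = timedelta(minutes=10)


def estimated_removal_time(drain_time, unneeded_time=DEFAULT_SCALE_DOWN_UNNEEDED_TIME):
    """Estimate how long one node takes to disappear: unneeded wait plus drain."""
    return unneeded_time + drain_time


def estimated_rotation_time(node_count, drain_time,
                            unneeded_time=DEFAULT_SCALE_DOWN_UNNEEDED_TIME):
    """Estimate the total time to remove node_count nodes.

    Assumption (not confirmed by the source): nodes are removed sequentially,
    so the per-node estimate is simply multiplied by the node count.
    """
    return node_count * estimated_removal_time(drain_time, unneeded_time)


if __name__ == "__main__":
    drain = timedelta(minutes=2)  # hypothetical drain time
    print(estimated_removal_time(drain))
    print(estimated_rotation_time(5, drain))
    # Same five nodes with a lowered unneeded time of 1 minute.
    print(estimated_rotation_time(5, drain, unneeded_time=timedelta(minutes=1)))
```

Under these assumptions, the unneeded wait dominates the total whenever the drain itself is fast, which is why shortening or bypassing it matters.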
It would be good to look at options that could prompt cluster autoscaler to remove the node sooner than `scale-down-unneeded-time` allows, since we already know the node should be removed.