autoscaler
Autoscaler 1.25 or later: If a node fails to be deleted, lastScaleDownFailTime is not refreshed.
Which component are you using?:
cluster-autoscaler
What version of the component are you using?: autoscaler 1.25
Component version: cluster-autoscaler-1.25.0
What k8s version are you using (kubectl version)?:
What environment is this in?: hws
What did you expect to happen?: If a node fails to be deleted, lastScaleDownFailTime should be refreshed.
What happened instead?:
If the goroutine fails to delete a node, the error is not detected and the function still returns nil, so lastScaleDownFailTime is not refreshed.
As a result, the scale-down-delay-after-failure parameter never takes effect after a failed deletion; the scale-down-delay-after-delete parameter takes effect instead.
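To make the reported pattern concrete, here is a minimal sketch. It is not the cluster-autoscaler code: `deleteNode`, `startDeletionLossy`, `startDeletionCollecting`, and `scaleDownState` are hypothetical names invented for illustration, and the "fix" shown (a buffered error channel) is only one possible way to surface goroutine errors to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// deleteNode is a hypothetical stand-in for the cloud-provider delete call.
// It fails for one specific node name so the failure path can be exercised.
func deleteNode(name string) error {
	if name == "bad-node" {
		return errors.New("delete failed: " + name)
	}
	return nil
}

// startDeletionLossy mirrors the behaviour described in this issue:
// the error is dropped inside the goroutine, so the caller always sees nil.
func startDeletionLossy(nodes []string) error {
	var wg sync.WaitGroup
	for _, n := range nodes {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			_ = deleteNode(name) // error is discarded here
		}(n)
	}
	wg.Wait()
	return nil
}

// startDeletionCollecting sends each goroutine's error over a buffered
// channel and returns a non-nil error if any deletion failed.
func startDeletionCollecting(nodes []string) error {
	errCh := make(chan error, len(nodes)) // buffered: senders never block
	var wg sync.WaitGroup
	for _, n := range nodes {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			if err := deleteNode(name); err != nil {
				errCh <- err
			}
		}(n)
	}
	wg.Wait()
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	if len(errs) > 0 {
		return fmt.Errorf("%d of %d node deletions failed: %v", len(errs), len(nodes), errs)
	}
	return nil
}

// scaleDownState is a hypothetical holder for the two timestamps
// that the delay parameters are compared against.
type scaleDownState struct {
	lastScaleDownFailTime   time.Time
	lastScaleDownDeleteTime time.Time
}

// record refreshes the failure timestamp on error and the delete
// timestamp otherwise.
func (s *scaleDownState) record(err error) {
	if err != nil {
		s.lastScaleDownFailTime = time.Now()
		return
	}
	s.lastScaleDownDeleteTime = time.Now()
}
```

With the lossy variant, `record` only ever receives nil, so only the delete timestamp moves and the after-failure delay is never applied. With the collecting variant, a call such as `startDeletionCollecting([]string{"node-a", "bad-node"})` returns a non-nil error and the failure timestamp is refreshed.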
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
/assign
I don't see the attached function here: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/scaledown/actuation/actuator.go#L96 Can you please help me locate which goroutine is being referred to?
These two functions never collect errors, even if the node fails to be deleted.
/triage accepted
Hey @gjtempleton, can I work on this issue?