
Fix cordon/uncordon logic

Open wbuchwalter opened this issue 7 years ago • 4 comments

Fix #7.

  • Add a --util-threshold parameter that sets the CPU utilization percentage below which a node is cordoned.
  • The autoscaler now uncordons existing unschedulable nodes before creating new ones.
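
The second point can be sketched roughly as follows. This is only an illustration of the "uncordon first, then create" ordering, not the code from this change; `plan_scale_up` and the node names are hypothetical.

```python
def plan_scale_up(nodes_needed, cordoned_nodes):
    """Decide how to satisfy a scale-up request.

    Hypothetical helper: reuse cordoned (unschedulable) nodes first,
    and only create new nodes for whatever demand is left over.
    """
    # Reuse as many already-provisioned cordoned nodes as we need.
    to_uncordon = cordoned_nodes[:nodes_needed]
    # Anything not covered by uncordoning has to be newly created.
    to_create = nodes_needed - len(to_uncordon)
    return to_uncordon, to_create


# Three nodes needed, two cordoned nodes available:
# both get uncordoned and one new node is created.
uncordon, create = plan_scale_up(3, ["node-a", "node-b"])
print(uncordon, create)
```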

wbuchwalter avatar Aug 04 '17 04:08 wbuchwalter

Awesome! We can test this hopefully later this week?

Note that CPU utilization is probably useless to us, though. Pod count is what we mostly care about...

yuvipanda avatar Aug 08 '17 23:08 yuvipanda

(but I understand if we're too specific a use case)

yuvipanda avatar Aug 08 '17 23:08 yuvipanda

@yuvipanda Sorry for the very long delay in responding. Was on vacation with almost no internet access.

Note that CPU utilization in this context is not the VM's actual real-time CPU usage but what is reserved on the node through requests and limits. I assume in your case each pod should have the same amount of CPU assigned?

Assuming, for example, that you want to assign 2 CPUs per pod, that you are using NC24 (24 vCPUs), and that you want to cordon any node with fewer than 2 pods, you could set --util-threshold to 1/6th (2 pods × 2 CPUs = 4 of 24 CPUs) to achieve what you want. Could this work in your case?
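
To make the arithmetic concrete, here is a small sketch. It assumes the node exposes all 24 vCPUs of an NC24 (real allocatable CPU is usually slightly lower) and that a node is cordoned when its reserved fraction is strictly below the threshold; `util_threshold` here is a hypothetical helper, not the autoscaler's code.

```python
from fractions import Fraction


def util_threshold(cpus_per_pod, min_pods, node_cpus):
    """Reserved-CPU fraction of a node that holds exactly `min_pods` pods."""
    return Fraction(cpus_per_pod * min_pods, node_cpus)


# 2 CPUs per pod, keep nodes with at least 2 pods, 24 vCPUs per node (assumed).
threshold = util_threshold(cpus_per_pod=2, min_pods=2, node_cpus=24)
print(threshold)                          # 1/6
print(round(float(threshold) * 100, 1))   # 16.7 (as a percentage)

# A node with one pod reserves 2/24 of its CPU, which is below the threshold.
one_pod = Fraction(2, 24)
# A node with two pods reserves 4/24, which is not below the threshold.
two_pods = Fraction(4, 24)
print(one_pod < threshold, two_pods < threshold)  # True False
```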

wbuchwalter avatar Sep 01 '17 13:09 wbuchwalter

Heya! We don't actually set CPU limits or guarantees, only memory ones. That's worked out well for us so far, although we realize it isn't a best practice :D We could probably fix that and add CPU limits too, and then this could work...

We haven't tested this out at all though - we've just over-provisioned our cluster for now...

yuvipanda avatar Sep 02 '17 17:09 yuvipanda