public-cloud-roadmap

Full hour node management/billing for cluster autoscaling

Open nantoniazzi opened this issue 2 years ago • 0 comments

As a Kubernetes user,

  • I would like to pay only for the hours in which the machines are really used
  • Or reuse the same billed hour if a machine is stopped and started multiple times within that hour
  • Or get a way in the cluster autoscaler to have a better scale-down variable (something smarter than scaleDownUnneededTimeSeconds).

I do not know which option is the easiest for OVH to implement or which is the fairest, but the current billing behavior is quite harsh and the workarounds are not easy to implement.

Here is a classic scenario:

  1. The horizontal autoscaler decides to add a new Pod. This Pod triggers the spawn of a new Node. Billing for the Node starts.
  2. After 20 minutes, the Pod is no longer useful and is deleted. My Node could be stopped.

Strategy A: I choose to stop Nodes as soon as they are no longer used.

  3. I stop the Node.
  4. 5 minutes later, a Pod is added by the autoscaler. This Pod requires a Node, so a new Node is spawned.
  5. After another 20 minutes, the Pod is deleted. My Node can be stopped too.

=> I paid for 2 machine hours for 40 minutes of usage, although the total duration from the first Node spawning to the last Node shutdown was 45 minutes (less than 1 hour). If the Pod is spawned/stopped 4 times during the hour, I will pay for 4 hours instead of 1.

So maybe another strategy can be used: I can try to keep Nodes active for a longer period, even when they are not used by any Pod, to avoid paying multiple times for the same hour.

Strategy B: Try to keep the Node active longer.

  3. I keep my Node started because I set scaleDownUnneededTimeSeconds to 50 minutes.
  4. In this scenario, no new Pods/Nodes are required for multiple hours. After 50 minutes of inactivity, my Node is stopped.

=> I kept my Node active for 20 minutes + 50 minutes (waiting). That is more than 1 hour, so I will pay for 2 full hours. In reality, I only spent 20 minutes processing; the other 100 minutes were paid for nothing: 50 minutes of scaleDownUnneededTimeSeconds and an extra 50 minutes with the machine stopped.
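The arithmetic behind both strategies can be sketched as follows. This is only an illustration, assuming that each Node is billed per started hour and that every new Node starts its own billing hour, as described in the scenarios above. The function name `billed_hours` is hypothetical and is not part of any OVH or Kubernetes API.

```python
import math

def billed_hours(node_lifetimes_min):
    """Return the total number of billed hours.

    Assumption: each Node lifetime (in minutes) is rounded up
    to a full hour, and every Node is billed independently.
    """
    return sum(math.ceil(minutes / 60) for minutes in node_lifetimes_min)

# Strategy A: two separate Nodes, each alive for 20 minutes.
strategy_a = billed_hours([20, 20])

# Strategy B: one Node alive for 20 minutes of work + 50 minutes of waiting.
strategy_b = billed_hours([20 + 50])

# A Pod spawned/stopped 4 times within the same hour, 10 minutes each time.
four_restarts = billed_hours([10, 10, 10, 10])

print(strategy_a, strategy_b, four_restarts)
```

Under these assumptions, both strategies end up billing 2 hours for 20 to 40 minutes of real work, and the four short-lived Nodes bill 4 hours.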

Maybe I can try to tweak scaleDownUnneededTimeSeconds to something smaller, but it is just a matter of luck: I do not know how long my Pod is going to be used. If the value is too small, I am back to the problem of Strategy A. This value should be adaptive, so that machines stay on for full hours and are stopped before the next billing hour if they are no longer required.
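A minimal sketch of what such an adaptive rule could look like, assuming billing hours are counted from the Node's start time. The names `minutes_to_next_billing_hour`, `should_scale_down` and `margin_min` are hypothetical; this is not an existing cluster autoscaler option, only an illustration of the requested behavior.

```python
def minutes_to_next_billing_hour(node_age_min):
    """Minutes left before the Node enters its next billed hour.

    Assumption: billing hours start at Node creation time.
    """
    return 60 - (node_age_min % 60)

def should_scale_down(node_age_min, is_idle, margin_min=5):
    """Remove an idle Node only shortly before its next billed hour begins.

    margin_min is a hypothetical safety margin that leaves time
    for the Node to be deleted before the next hour is charged.
    """
    return is_idle and minutes_to_next_billing_hour(node_age_min) <= margin_min

# An idle Node that is 20 minutes old is kept: the hour is already paid for.
print(should_scale_down(20, is_idle=True))
# The same Node, still idle at 56 minutes, is removed before the second hour.
print(should_scale_down(56, is_idle=True))
# A busy Node is never removed, whatever its age.
print(should_scale_down(56, is_idle=False))
```

With a rule like this, the Node from the scenario above would stay available for the rest of its first hour and be deleted before a second hour is billed.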

Possible workarounds:

  • I could try to implement custom metrics for the autoscaler to bring full-hour usage into the equation. This is really not trivial.
  • Or, I could set scaleDownUnneededTimeSeconds to a low value (10 seconds) and run a Pod on every Node of the cluster that generates fake CPU activity during full hours (only when it is the only Pod running on the Node). 5 minutes before the end of each full hour, the Pod checks whether it is the only one generating CPU activity. If so, it stops and lets the cluster autoscaler delete the Node. This workaround is really not good for the planet, so I won't implement it.

Note: In our use case, we use b2-60 machines, with between 6 (low usage) and 30 (high usage) nodes running at the same time during the day. Saving 1 hour per unused machine can make a big difference to our bill at the end of the month.

nantoniazzi, Apr 08 '22 14:04