
Add CPU Utilization target scaling policy

Open arelra opened this issue 8 months ago • 4 comments

What does this change?

Follow up to https://github.com/guardian/dotcom-rendering/issues/9311

Adds a CPU-utilization-based target tracking scaling policy.

Why?

We could use this policy in addition to the current latency-based step scaling that allows us to meet the SLOs for each stack.

Notes

  • When combining scaling policies the following rules apply:

    Amazon EC2 Auto Scaling chooses the policy that provides the largest capacity for both scale out and scale in

    I expect the two policies to reach an equilibrium when CPU utilisation is at 40% and we are rendering within our latency targets for each stack.

    However, caution is warranted when combining policies, as they might conflict and cause a ping-pong effect when scaling in.

    I will get DevX's input on this approach and if approved we can validate in production.

  • Depending on feedback and testing, we may opt to use only target scaling on CPU.

  • We only apply scaling policies when the stage is PROD, so the current snapshot tests, which use the stage TEST, do not output scaling policies to the snapshots. I have updated the snapshot tests to use the stage PROD.
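
To make the combination rule in the notes concrete, here is a minimal TypeScript sketch of how the two policies might resolve to a single desired capacity. It is a toy model, not the CDK code in this PR: the function names, the latency thresholds and step sizes, and the proportional approximation of target tracking are all assumptions made for illustration. Only the 40% CPU target, the "largest capacity wins" rule, and the PROD-only gating come from the notes above.

```typescript
// Toy model of combined scaling policies. All names and the latency
// thresholds below are hypothetical; only the 40% CPU target and the
// "largest capacity wins" rule are taken from the notes.
type Stage = 'PROD' | 'CODE' | 'TEST';

interface Metrics {
	cpuPercent: number; // average CPU utilisation across the group
	latencyMs: number; // hypothetical latency metric used by step scaling
}

const CPU_TARGET_PERCENT = 40;

// Assumption: approximate target tracking as proportional scaling
// towards the CPU target. The real service is more conservative.
function targetTrackingCapacity(current: number, cpuPercent: number): number {
	return Math.ceil(current * (cpuPercent / CPU_TARGET_PERCENT));
}

// Assumption: illustrative latency steps, not the real alarm bounds.
function stepScalingCapacity(current: number, latencyMs: number): number {
	if (latencyMs >= 400) return current + 4;
	if (latencyMs >= 200) return current + 2;
	if (latencyMs < 100) return current - 1;
	return current;
}

function desiredCapacity(
	stage: Stage,
	current: number,
	metrics: Metrics,
	min: number,
	max: number,
): number {
	// Scaling policies are only applied when the stage is PROD.
	if (stage !== 'PROD') return current;

	const fromCpu = targetTrackingCapacity(current, metrics.cpuPercent);
	const fromLatency = stepScalingCapacity(current, metrics.latencyMs);

	// The policy that provides the largest capacity wins,
	// for both scale out and scale in.
	const combined = Math.max(fromCpu, fromLatency);

	// Clamp to the group's min/max size.
	return Math.min(max, Math.max(min, combined));
}

// CPU is hot but latency is fine: the CPU policy drives scale out.
console.log(desiredCapacity('PROD', 10, { cpuPercent: 60, latencyMs: 150 }, 3, 30));
// Both policies want to scale in: the smaller reduction is applied.
console.log(desiredCapacity('PROD', 10, { cpuPercent: 20, latencyMs: 50 }, 3, 30));
```

In this model the group only scales in as far as the more cautious policy allows, which is why the two policies need compatible targets to settle rather than ping-pong.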

arelra, Jun 24 '24 15:06