garden-runc-release
Document CPU entitlement
The CPU entitlements feature can be quite tricky to understand. We want to document both its behaviour and its implementation.
For the behaviour, we have some good starting points:
- CPU Entitlements in Cloud Foundry (video)
- A Better Way to Split the Cake: CPU Entitlements (blog post)
- The CPU Entitlement doc
Let's also remember to mention the CF CLI plugins!
For the implementation, let's try to describe:
- how CPU shares work in the kernel
- how we use nested CPU cgroups to implement throttling
Had a read of what needs to be done. Have a rough outline of what the doc should include:
CPU Entitlement and Throttling
CPU entitlements and CPU throttling are two interconnected features in Garden, designed to improve how CPU resources are distributed to applications in Cloud Foundry.
What does this look like for users?
- overview - maybe just link to video/blog post as they are high level
- explain the problem - the default CPU metric is inaccurate
- what currently happens with spare resources
- what is entitlement
- how do the spare resources get distributed now
How is it implemented?
Entitlement
- how we calculate the value
- what we emit to Loggregator
- the CPU entitlement plugins
- mention how to enable it in a deployment
- explain what entitlement_per_share is and what happens with different values https://www.pivotaltracker.com/story/show/171307969
- explain what happens when the bad apps outnumber the good ones and how we fixed this https://www.pivotaltracker.com/story/show/171050093
Throttling
- explanation on how CPU shares work https://kernel.googlesource.com/pub/scm/linux/kernel/git/glommer/memcg/+/cpu_stat/Documentation/cgroups/cpu.txt
- explain good and bad cgroups
- explain how we move between those cgroups
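For the "how CPU shares work" bullet, a small worked example might help readers. The sketch below is an idealized model of proportional sharing, not Garden's code and not the kernel scheduler: it assumes sibling cgroups competing for one pool of CPU, gives each a slice proportional to its shares, and hands any slice a cgroup does not use to the others. The function name `split_cpu`, the cgroup names, and the percentage units are made up for illustration.

```python
def split_cpu(shares, demand, capacity=100.0):
    """Idealized split of `capacity` (in percent) among sibling cgroups.

    shares: hypothetical cgroup name -> CPU shares (relative weight)
    demand: hypothetical cgroup name -> CPU the cgroup wants to use (percent)
    Returns a dict of cgroup name -> CPU granted (percent).
    """
    alloc = {name: 0.0 for name in shares}
    # Only cgroups that actually want CPU take part in the split.
    active = {name for name in shares if demand[name] > 0}
    remaining = capacity

    while active and remaining > 1e-9:
        total_shares = sum(shares[name] for name in active)
        # Cgroups whose demand fits inside their proportional slice.
        satisfied = {
            name for name in active
            if demand[name] <= remaining * shares[name] / total_shares
        }
        if not satisfied:
            # Everyone wants more than their slice: split strictly by shares.
            for name in active:
                alloc[name] = remaining * shares[name] / total_shares
            break
        # Grant the modest cgroups what they asked for and redistribute
        # whatever they left unused among the rest in the next round.
        for name in satisfied:
            alloc[name] = demand[name]
            remaining -= demand[name]
        active -= satisfied

    return alloc


if __name__ == "__main__":
    shares = {"app-a": 512, "app-b": 512, "app-c": 1024}
    busy = {"app-a": 100.0, "app-b": 100.0, "app-c": 100.0}
    print(split_cpu(shares, busy))
```

When every cgroup is busy, the split follows the share ratio. When one of them is mostly idle, its unused slice goes to the others in proportion to their shares, which is the "spare resources" behaviour the outline wants to explain.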
Runtime tracker story has been generated for this issue: https://www.pivotaltracker.com/story/show/180812792
Hello @ameowlia and @MarcPaquette. Is this getting worked on? What's the current state? We have it in our backlog so I assume somebody made a commitment to work on this from our end. Is it actionable?
Hi @sleepychild,
This issue is in our icebox at this time, and we don't have any immediate plans to work on it. I can bring it up in IPM next week to discuss.
Thanks
Hi @MarcPaquette,
There was a discussion between @ameowlia and our PO and they agreed that our team is going to work on this documentation.
Hello @ameowlia and @MarcPaquette,
Could you please take a look at the following PRs: https://github.com/cloudfoundry/docs-cf-admin/pull/220 https://github.com/cloudfoundry/docs-loggregator/pull/64
Thank you in advance.
Merged!