cgroup: change conversion from CPU shares to weight
The OCI CPU shares (range [2-262144]) to cgroup v2 cpu.weight (range [1-10000]) conversion formula has been updated
Closes: https://github.com/containers/crun/issues/1721
Reviewer's Guide
This PR replaces the linear CPU shares→cpu.weight mapping with a log₂-based quadratic formula, adds configure-time log2 detection, updates tests and documentation to match the new conversion, and adjusts build dependencies for Alpine.
Class diagram for updated CPU shares to weight conversion
classDiagram
class CgroupInternal {
+uint64_t convert_shares_to_weight(uint64_t shares)
}
CgroupInternal : +convert_shares_to_weight(uint64_t shares)
CgroupInternal : <<static>>
%% The function now uses log2, pow, and ceil for the new formula
%% No new classes, but the method implementation is significantly changed
File-Level Changes
| Change | Details | Files |
|---|---|---|
| Detect log2 availability in configure.ac with and without -lm |
|
configure.ac |
| Refactor CPU weight tests to cover new mapping edge cases |
|
tests/test_resources.py |
| Implement new log₂-based quadratic conversion in cgroup internals |
|
src/libcrun/cgroup-internal.h |
| Update documentation tables to reflect new conversion formula |
|
crun.1.mdcrun.1 |
| Add missing gperf dependency to Alpine build Dockerfile |
|
tests/alpine-build/Dockerfile |
Tips and commands
Interacting with Sourcery
- Trigger a new review: Comment
@sourcery-ai reviewon the pull request. - Continue discussions: Reply directly to Sourcery's review comments.
- Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with
@sourcery-ai issueto create an issue from it. - Generate a pull request title: Write
@sourcery-aianywhere in the pull request title to generate a title at any time. You can also comment@sourcery-ai titleon the pull request to (re-)generate the title at any time. - Generate a pull request summary: Write
@sourcery-ai summaryanywhere in the pull request body to generate a PR summary at any time exactly where you want it. You can also comment@sourcery-ai summaryon the pull request to (re-)generate the summary at any time. - Generate reviewer's guide: Comment
@sourcery-ai guideon the pull request to (re-)generate the reviewer's guide at any time. - Resolve all Sourcery comments: Comment
@sourcery-ai resolveon the pull request to resolve all Sourcery comments. Useful if you've already addressed all the comments and don't want to see them anymore. - Dismiss all Sourcery reviews: Comment
@sourcery-ai dismisson the pull request to dismiss all existing Sourcery reviews. Especially useful if you want to start fresh with a new review - don't forget to comment@sourcery-ai reviewto trigger a new review!
Customizing Your Experience
Access your dashboard to:
- Enable or disable review features such as the Sourcery-generated pull request summary, the reviewer's guide, and others.
- Change the review language.
- Add, remove or edit custom review instructions.
- Adjust other review settings.
Getting Help
- Contact our support team for questions or feedback.
- Visit our documentation for detailed guides and information.
- Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.
hmm wouldnt' this need to be opt-in? I think we'll need some way of coordinating this across the ecosystem before a runtime makes this decision
hmm wouldnt' this need to be opt-in? I think we'll need some way of coordinating this across the ecosystem before a runtime makes this decision
I am not going to merge this yet, I've marked it as a Draft and queued it for "crun 2" with a bunch of other breaking changes: https://github.com/containers/crun/milestone/1
I don't see the risk of different runtimes/engines using a different formula. cpu shares/weight are relative to their peer cgroups. e.g. if you've two cgroups (/a/b/container_1 and /a/b/container_2), the weight you set for container_1 affects only container_1 and container_2 but it does not affect in any way b, which in our case is the cgroup created by Kubernetes. So unless you mix runtimes handling the same cgroup level, there should be no conflict.
Also, the entire conversion from shares to weight is an undocumented hack. We should start using the native weight value using the unified configuration in the OCI specs, instead of what cgroup v1 was doing. So in a sense, it is already "drop-in", given there is already a (better) option to specify directly the weight.
TMT tests failed. @containers/packit-build please check.
hmm wouldnt' this need to be opt-in? I think we'll need some way of coordinating this across the ecosystem before a runtime makes this decision
I am not going to merge this yet, I've marked it as a Draft and queued it for "crun 2" with a bunch of other breaking changes: containers/crun/milestone/1
I don't see the risk of different runtimes/engines using a different formula. cpu shares/weight are relative to their peer cgroups. e.g. if you've two cgroups (
/a/b/container_1and/a/b/container_2), the weight you set forcontainer_1affects onlycontainer_1andcontainer_2but it does not affect in any wayb, which in our case is the cgroup created by Kubernetes. So unless you mix runtimes handling the same cgroup level, there should be no conflict.Also, the entire conversion from shares to weight is an undocumented hack. We should start using the native weight value using the
unifiedconfiguration in the OCI specs, instead of what cgroup v1 was doing. So in a sense, it is already "drop-in", given there is already a (better) option to specify directly the weight.
@haircommander any thoughts? If we change that as a purely implementation-detail, I guess we don't need a KEP to drive it? Would that be acceptable?
@haircommander any thoughts? If we change that as a purely implementation-detail, I guess we don't need a KEP to drive it? Would that be acceptable?
the KEP would have impact only the conversion performed by Kubernetes or the CRI runtime. Decisions in this project (and I'd assume runc too) are guided only by the OCI runtime-specs, not a KEP.
Ephemeral COPR build failed. @containers/packit-build please check.
TMT tests failed. @containers/packit-build please check.
@haircommander any thoughts? If we change that as a purely implementation-detail, I guess we don't need a KEP to drive it? Would that be acceptable?
the KEP would have impact only the conversion performed by Kubernetes or the CRI runtime. Decisions in this project (and I'd assume runc too) are guided only by the OCI runtime-specs, not a KEP.
I was referring to whether the CRI would have a way to opt-into (or out of) this to ensure the new conversion takes place on all OCIs at the same k8s version.
Obviously decisions in this repo aren't driven by Kubernetes KEPs, I'm just raising questions about their use-case from a user perspective. I wonder if for the k8s ecosystem (which is a significant user) treating the conversion as an implementation detail is a problem, and how important it is to have an opt-in/out mechanism.
I was referring to whether the CRI would have a way to opt-into (or out of) this to ensure the new conversion takes place on all OCIs at the same k8s version.
Obviously decisions in this repo aren't driven by Kubernetes KEPs, I'm just raising questions about their use-case from a user perspective. I wonder if for the k8s ecosystem (which is a significant user) treating the conversion as an implementation detail is a problem, and how important it is to have an opt-in/out mechanism.
I think we are overthinking it. The current mechanism is quite broken but you are the first one who noticed and complained about it :-) Most people won't notice the difference, which is an improvement anyway.
We need to move completely away from cgroup v1. A real improvement would be how to use cpu.weight instead of relying on the conversion from cpu shares and that makes things completely in charge of Kubernetes, you get the weight value that you specify in the specs.
A real improvement would be how to use cpu.weight instead of relying on the conversion from cpu shares
+100! :+1:
Ephemeral COPR build failed. @containers/packit-build please check.
TMT tests failed. @containers/packit-build please check.
Ephemeral COPR build failed. @containers/packit-build please check.
still LGTM )
I missed that the equivalent patch for runc was merged, so no reasons to postpone this one :-)