
cgroup2: confusing error when `linux.resources.cpu.shares=1`

Open corhere opened this issue 8 months ago • 2 comments

When passed a spec with an out-of-range CPU shares value, runc fails to start the container and reports an error message. On a cgroup1 host, the returned error message makes it clear which part of the container config is the problem.

minimum allowed cpu-shares is 2

However, on a cgroup2 host, the error message makes no mention of CPU shares.

failed to write "70369281052672": write /sys/fs/cgroup/.../cpu.weight: numerical result out of range
$ docker run --rm --cpu-shares 1 hello-world
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting cgroup config for procHooks process: failed to write "70369281052672": write /sys/fs/cgroup/docker/4139b57a20695ff4f62dda799b1c4052791f59bac4611534cb3a213a3e446add/cpu.weight: numerical result out of range: unknown.

It is not at all obvious that a CPU weight of 70369281052672 has anything to do with "shares": 1 in the container config.
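To make the connection concrete, here is a minimal sketch of how that number can arise. It assumes the conversion is a linear mapping of shares [2, 262144] onto weight [1, 10000], computed in unsigned 64-bit arithmetic with no range check. `sharesToWeight` is a hypothetical stand-in written for illustration, not a copy of the library function.

```go
package main

import "fmt"

// sharesToWeight is a hypothetical stand-in for the cgroup v1 -> v2
// conversion. Assumption: shares in [2, 262144] map linearly onto
// weights in [1, 10000], with no validation of the input.
func sharesToWeight(shares uint64) uint64 {
	if shares == 0 {
		return 0 // assumption: zero means "not set"
	}
	// For shares < 2, (shares - 2) wraps around to a huge uint64,
	// and the multiplication wraps again modulo 2^64.
	return 1 + ((shares-2)*9999)/262142
}

func main() {
	for _, s := range []uint64{1, 2, 1024, 262144} {
		fmt.Printf("shares=%d -> weight=%d\n", s, sharesToWeight(s))
	}
}
```

Under these assumptions, `shares=1` underflows to 70369281052672, the same value that appears in the error above, while in-range inputs such as 2 and 262144 map to 1 and 10000.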

Expected: runc returns an error message which flags the CPU shares configuration as being out of range, irrespective of the host system's configuration.

Why this is a runc bug

  • According to the runtime-spec, it is valid for shares to be set to any value within the range of a uint64, [0, 2^64). Therefore it is not a bug in whatever produces the spec to set the CPU shares to 1.
  • runc (libcontainer) is the party responsible for mapping the cpu-shares value in the container config to a cpu-weight value for cgroups v2. https://github.com/opencontainers/runc/blob/8d90e3dba696ac787ee64de4445517ddf1063b04/libcontainer/specconv/spec_linux.go#L826-L831 https://github.com/opencontainers/cgroups/blob/9657f5a18b8d60a0f39fbb34d0cb7771e28e6278/fs2/cpu.go#L30
  • The ConvertCPUSharesToCgroupV2Value function is documented to only take inputs in the range [2, 262144]. Calling the function with values outside that range is therefore a bug in the caller, by definition. https://github.com/opencontainers/cgroups/blob/9657f5a18b8d60a0f39fbb34d0cb7771e28e6278/utils.go#L416-L426
  • runc calls the conversion function without validating that its argument is within the valid range. Therefore the bug is in runc for not validating. QED.
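The kind of check being asked for could look roughly like the sketch below. It is only an illustration: `checkedSharesToWeight`, the constants, and the error wording are hypothetical names chosen here, the linear formula is an assumption, and none of this is the code from runc or from the cgroups library.

```go
package main

import "fmt"

// Documented input range of the conversion, per the bullet list above.
const (
	minCPUShares uint64 = 2
	maxCPUShares uint64 = 262144
)

// checkedSharesToWeight is a hypothetical wrapper that validates the
// shares value before converting it, so the error names the config
// field instead of the derived cpu.weight value.
func checkedSharesToWeight(shares uint64) (uint64, error) {
	if shares == 0 {
		return 0, nil // assumption: zero means "not set"
	}
	if shares < minCPUShares || shares > maxCPUShares {
		return 0, fmt.Errorf("invalid linux.resources.cpu.shares %d: must be in range [%d, %d]",
			shares, minCPUShares, maxCPUShares)
	}
	// Assumed linear mapping of [2, 262144] onto [1, 10000].
	return 1 + ((shares-minCPUShares)*9999)/(maxCPUShares-minCPUShares), nil
}

func main() {
	for _, s := range []uint64{1, 1024} {
		w, err := checkedSharesToWeight(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("shares=%d -> weight=%d\n", s, w)
	}
}
```

Rejecting the value before the conversion keeps the message the same on cgroup1 and cgroup2 hosts, since it no longer depends on what the kernel says when the write to `cpu.weight` fails.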

corhere, May 01 '25 20:05

Good analysis, thanks @corhere.

Fix part 1: https://github.com/opencontainers/cgroups/pull/17

(part 2 is to switch to the new conversion function and return an error early)

kolyshkin, May 13 '25 23:05

It seems the cgroups PR was included in 0.0.3 and it's included in runc main. Shall we backport it and close this? WDYT @kolyshkin ?

rata, Jul 14 '25 10:07