runc
runc copied to clipboard
libcontainer: set pids limit to max when set to 0
potentially fixes https://github.com/opencontainers/runc/issues/4014
Some tests need to be fixed.
After you fix all the errors, I think you should squash all your commits to one, and then add an integration test for PidsLimit
to avoid this error when someone refactor the code.
I can't see cs9 failure but I'm hoping it's a flake? regardless, comments addressed, PTAL
I can't see cs9 failure but I'm hoping it's a flake?
not ok 19 runc run (cgroup v2 resources.unified override)
# (in test file tests/integration/cgroups.bats, line 262)
# `[ "$status" -eq 0 ]' failed
# runc spec (status=0):
#
# runc run -d --console-socket /tmp/bats-run-YlegjG/runc.T1kloa/tty/sock test_cgroups_unified (status=1):
# time="2023-09-15T13:29:09Z" level=error msg="runc run failed: unable to start container process: container init was OOM-killed (memory limit too low?)"
I think maybe we need #3987 merged to fix this issue.
I think maybe we need #3987 merged to fix this issue.
CI has been fixed by #4020
The current runc code treats PidsLimit value like this:
-
> 0
-- set to specific value -
-1
(or, in some cases, any negative value) -- set to "max" - `== 0' -- unset
So, we have the "unset" case without using a pointer.
The only issue is, we should properly convert runtime-spec representation into internal representation, essentially meaning, if spec value is <=0
, set PidsLimit to -1
.
I.e. something like this might be sufficient to fix the bug:
--- a/libcontainer/specconv/spec_linux.go
+++ b/libcontainer/specconv/spec_linux.go
@@ -772,7 +772,13 @@ func CreateCgroupConfig(opts *CreateOpts, defaultDevs []*devices.Device) (*confi
c.Resources.CPUIdle = r.CPU.Idle
}
if r.Pids != nil {
- c.Resources.PidsLimit = r.Pids.Limit
+ if r.Pids.Limit > 0 {
+ c.Resources.PidsLimit = r.Pids.Limit
+ } else {
+ // Convert from runtime-spec (where a value <= 0 means "max")
+ // to internal (where -1 is "max", and 0 is "unset").
+ c.Resources.PidsLimit = -1
+ }
}
if r.BlockIO != nil {
if r.BlockIO.Weight != nil {
--- a/libcontainer/configs/cgroup_linux.go
+++ b/libcontainer/configs/cgroup_linux.go
@@ -90,7 +90,7 @@ type Resources struct {
// cgroup SCHED_IDLE
CPUIdle *int64 `json:"cpu_idle,omitempty"`
- // Process limit; set <= `0' to disable limit.
+ // Process limit; 0 is "unset", -1 is "max".
PidsLimit int64 `json:"pids_limit"`
// Specifies per cgroup weight, range is from 10 to 1000.
~~(alas, we probably also need the same code in update.go
)~~
WDYT @lifubang @haircommander? I am more comfortable with a change like this merely because it is smaller and does not introduce a potential panic when dereferening a nil pointer.
because it is smaller
Yes, it is indeed a smaller change with your code. But I think github.com/opencontainers/runc/libcontainer/cgroups
is used by some upstream projects, they must know this knowledge before they write the code, so I think maybe we should remain it as is?
WDYT @kolyshkin
Yes, it is indeed a smaller change with your code. But I think
github.com/opencontainers/runc/libcontainer/cgroups
is used by some upstream projects, they must know this knowledge before they write the code, so I think maybe we should remain it as is?
That reminds me, changing Limit
field to be a pointer is a breaking change for all those projects. We said we do not guarantee that API will stay compatible, so if there is a reason to break it, I would rather do so, but in this case I don't see a reason. Currently we do support both "unlimited" and "unset" cases, and I guess existing users of libcontainer already know how to use these. So, nothing needs to be changed here.
I'll keep looking into it.
is a breaking change for all those projects
I think it can't be defined as a breaking change, we don't break any result, this is just only a change of type, when they update to the new version of libcontainer
, they know this change, because the code can't be compiled successfully.
but in this case I don't see a reason
There is a reason, if keep it as a raw type int64
, the value will be always 0, so we will always write max
to pids.max
when we call Manager.Set
, it is unnecessary.
is a breaking change for all those projects
I think it can't be defined as a breaking change, we don't break any result, this is just only a change of type, when they update to the new version of
libcontainer
, they know this change, because the code can't be compiled successfully.
This is exactly what a breaking change is. Your code worked, you updated a dependency, and now it won't even compile.
Yes, we've done it in the past and we'll do it again, but this should be justified.
but in this case I don't see a reason
There is a reason, if keep it as a raw type
int64
, the value will be always 0, so we will always writemax
topids.max
when we callManager.Set
, it is unnecessary.
Not necessarily. Current code treats PidsLimit==0
as unset, which is the behavior I intend to keep. In other words, I think the fix should belong to specconv (this is what I'm doing in #4023 -- let me know what you think).
I am happy either way as long as the behavior when setting 0 to pids limit is well defined :)