runc icon indicating copy to clipboard operation
runc copied to clipboard

libcontainer: set pids limit to max when set to 0

Open haircommander opened this issue 9 months ago • 10 comments

potentially fixes https://github.com/opencontainers/runc/issues/4014

haircommander avatar Sep 12 '23 20:09 haircommander

Some tests need to be fixed. After you fix all the errors, I think you should squash all your commits to one, and then add an integration test for PidsLimit to avoid this error when someone refactor the code.

lifubang avatar Sep 14 '23 04:09 lifubang

I can't see cs9 failure but I'm hoping it's a flake? regardless, comments addressed, PTAL

haircommander avatar Sep 14 '23 17:09 haircommander

I can't see cs9 failure but I'm hoping it's a flake?

not ok 19 runc run (cgroup v2 resources.unified override)
# (in test file tests/integration/cgroups.bats, line 262)
#   `[ "$status" -eq 0 ]' failed
# runc spec (status=0):
#
# runc run -d --console-socket /tmp/bats-run-YlegjG/runc.T1kloa/tty/sock test_cgroups_unified (status=1):
# time="2023-09-15T13:29:09Z" level=error msg="runc run failed: unable to start container process: container init was OOM-killed (memory limit too low?)"

I think maybe we need #3987 merged to fix this issue.

lifubang avatar Sep 15 '23 13:09 lifubang

I think maybe we need #3987 merged to fix this issue.

CI has been fixed by #4020

lifubang avatar Sep 19 '23 03:09 lifubang

The current runc code treats PidsLimit value like this:

  • > 0 -- set to specific value
  • -1 (or, in some cases, any negative value) -- set to "max"
  • `== 0' -- unset

So, we have the "unset" case without using a pointer.

The only issue is, we should properly convert runtime-spec representation into internal representation, essentially meaning, if spec value is <=0, set PidsLimit to -1.

I.e. something like this might be sufficient to fix the bug:

--- a/libcontainer/specconv/spec_linux.go
+++ b/libcontainer/specconv/spec_linux.go
@@ -772,7 +772,13 @@ func CreateCgroupConfig(opts *CreateOpts, defaultDevs []*devices.Device) (*confi
                                c.Resources.CPUIdle = r.CPU.Idle
                        }
                        if r.Pids != nil {
-                               c.Resources.PidsLimit = r.Pids.Limit
+                               if r.Pids.Limit > 0 {
+                                       c.Resources.PidsLimit = r.Pids.Limit
+                               } else {
+                                       // Convert from runtime-spec (where a value <= 0 means "max")
+                                       // to internal (where -1 is "max", and 0 is "unset").
+                                       c.Resources.PidsLimit = -1
+                               }
                        }
                        if r.BlockIO != nil {
                                if r.BlockIO.Weight != nil {
--- a/libcontainer/configs/cgroup_linux.go
+++ b/libcontainer/configs/cgroup_linux.go
@@ -90,7 +90,7 @@ type Resources struct {
        // cgroup SCHED_IDLE
        CPUIdle *int64 `json:"cpu_idle,omitempty"`
 
-       // Process limit; set <= `0' to disable limit.
+       // Process limit; 0 is "unset", -1 is "max".
        PidsLimit int64 `json:"pids_limit"`
 
        // Specifies per cgroup weight, range is from 10 to 1000.

~~(alas, we probably also need the same code in update.go)~~

WDYT @lifubang @haircommander? I am more comfortable with a change like this merely because it is smaller and does not introduce a potential panic when dereferening a nil pointer.

kolyshkin avatar Sep 21 '23 23:09 kolyshkin

because it is smaller

Yes, it is indeed a smaller change with your code. But I think github.com/opencontainers/runc/libcontainer/cgroups is used by some upstream projects, they must know this knowledge before they write the code, so I think maybe we should remain it as is? WDYT @kolyshkin

lifubang avatar Sep 22 '23 02:09 lifubang

Yes, it is indeed a smaller change with your code. But I think github.com/opencontainers/runc/libcontainer/cgroups is used by some upstream projects, they must know this knowledge before they write the code, so I think maybe we should remain it as is?

That reminds me, changing Limit field to be a pointer is a breaking change for all those projects. We said we do not guarantee that API will stay compatible, so if there is a reason to break it, I would rather do so, but in this case I don't see a reason. Currently we do support both "unlimited" and "unset" cases, and I guess existing users of libcontainer already know how to use these. So, nothing needs to be changed here.

I'll keep looking into it.

kolyshkin avatar Sep 22 '23 02:09 kolyshkin

is a breaking change for all those projects

I think it can't be defined as a breaking change, we don't break any result, this is just only a change of type, when they update to the new version of libcontainer, they know this change, because the code can't be compiled successfully.

but in this case I don't see a reason

There is a reason, if keep it as a raw type int64, the value will be always 0, so we will always write max to pids.max when we call Manager.Set, it is unnecessary.

lifubang avatar Sep 22 '23 02:09 lifubang

is a breaking change for all those projects

I think it can't be defined as a breaking change, we don't break any result, this is just only a change of type, when they update to the new version of libcontainer, they know this change, because the code can't be compiled successfully.

This is exactly what a breaking change is. Your code worked, you updated a dependency, and now it won't even compile.

Yes, we've done it in the past and we'll do it again, but this should be justified.

but in this case I don't see a reason

There is a reason, if keep it as a raw type int64, the value will be always 0, so we will always write max to pids.max when we call Manager.Set, it is unnecessary.

Not necessarily. Current code treats PidsLimit==0 as unset, which is the behavior I intend to keep. In other words, I think the fix should belong to specconv (this is what I'm doing in #4023 -- let me know what you think).

kolyshkin avatar Sep 22 '23 03:09 kolyshkin

I am happy either way as long as the behavior when setting 0 to pids limit is well defined :)

haircommander avatar Sep 25 '23 20:09 haircommander