kubernetes icon indicating copy to clipboard operation
kubernetes copied to clipboard

Explore replacing opencontainers/runc with containerd/cgroups

Open dims opened this issue 1 year ago • 8 comments

Kubernetes use of opencontainers/runc as a library is placing undue burden on the runc team, for example:

  • https://github.com/opencontainers/runc/issues/3028
  • https://github.com/opencontainers/runc/issues/3221#issuecomment-925972992

We now have a cgroups specific library in containerd org that we can explore to start slowly replacing functionality we needed earlier from runc i think. https://github.com/containerd/cgroups

As of right now k/k master shows the following imports of opencontainers/runc:

❯ rg '"github.com/opencontainers/runc' | grep -v vendor | cut -f 2 -d '"' | sort | uniq -c | sort
      1 github.com/opencontainers/runc/libcontainer/cgroups/systemd
      1 github.com/opencontainers/runc/libcontainer/utils
      2 github.com/opencontainers/runc/libcontainer/apparmor
      2 github.com/opencontainers/runc/libcontainer/cgroups/manager
      2 github.com/opencontainers/runc/libcontainer/configs
      2 github.com/opencontainers/runc/libcontainer/userns
      3 github.com/opencontainers/runc/libcontainer/cgroups/fscommon
     17 github.com/opencontainers/runc/libcontainer/cgroups

/sig node

dims avatar Oct 17 '24 16:10 dims

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Oct 17 '24 16:10 k8s-ci-robot

/area code-organization

dims avatar Oct 17 '24 17:10 dims

in both cases aren't we signing up a different project to support kubernetes and our needs/timelines? Obviously there's some overlap between containerd community and k8s but I don't see how moving to containerd package would solve this problem of k8s relying on a package outside of the k8s repo (and on contributors who may not have k8s as their priority/focus)

I personally think it's better to pull the public libcontainer pieces out of runc and have those versioned separately from runc, so we don't need to wait for runc verison bumps to have a tagged version of libcontainer.

haircommander avatar Oct 17 '24 18:10 haircommander

in both cases aren't we signing up a different project to support kubernetes and our needs/timelines? Obviously there's some overlap between containerd community and k8s but I don't see how moving to containerd package would solve this problem of k8s relying on a package outside of the k8s repo (and on contributors who may not have k8s as their priority/focus)

Agree. Not sure why this would resolve the dependency issue.

(cc @samuelkarp for the suggestion on relying on containerd library)

yujuhong avatar Oct 17 '24 18:10 yujuhong

Adding @kolyshkin as we have discussed this before.

mrunalp avatar Oct 17 '24 19:10 mrunalp

I personally think it's better to pull the public libcontainer pieces out of runc and have those versioned separately from runc, so we don't need to wait for runc verison bumps to have a tagged version of libcontainer.

Not necessarily disagreeing, but ... who will maintain these? This may be duplicating effort given https://github.com/containerd/cgroups is already being maintained for public import as a standalone library.

My understanding is the runc team is actively attempting to stop supporting reuse of this code, so we'd need someone else to maintain the fork that re-exports them. If containerd/cgroups already accomplishes this then ... (e.g. https://github.com/containerd/cgroups/commit/6f4af8b641e84180ca1408fdf676373a4d90844b)

in both cases aren't we signing up a different project to support kubernetes and our needs/timelines? Obviously there's some overlap between containerd community and k8s but I don't see how moving to containerd package would solve this problem of k8s relying on a package outside of the k8s repo (and on contributors who may not have k8s as their priority/focus)

Our needs: yes, to an extent, but the difference is containerd/cgroups is a seperate public library instead of some packages from runc's implementation. It already has a distinct version (currently v3.0.3) and timeline from containerd/containerd.

cc @thaJeztah as a recently prolific committer to containerd/cgroups.

BenTheElder avatar Oct 17 '24 19:10 BenTheElder

@BenTheElder yes, discussions have started in containerd project as well if they want to scope up or down to include what we may need.

dims avatar Oct 17 '24 22:10 dims

Here's another option!

  • Extract only files we use from runc : https://github.com/dims/libcontainer/
  • update cadvisor with above: https://github.com/dims/cadvisor/tree/try-dims-libcontainer-repo
  • then update k/k with new cadvisor and that library - https://github.com/kubernetes/kubernetes/pull/128200/files

We drop 20K lines of code and a bunch of tangled dependencies ... but then we own the library!

dims avatar Oct 19 '24 18:10 dims

I like this approach :) @kolyshkin @AkihiroSuda @cyphar @lifubang @rata WDYT

haircommander avatar Oct 21 '24 13:10 haircommander

@haircommander thanks for tagging us! I've just asked in the runc issue for opinions to other maintainers (https://github.com/opencontainers/runc/issues/3028#issuecomment-2426882847). Ideally, we should agree something and come-back with an opinion from the project (I'd like that, at least :)).

Just to understand, what is the context of this issue? Why do you want to remove the dependency on runc?

rata avatar Oct 21 '24 14:10 rata

The use of runc as a binary vs runc as a library has been a journey, in addition to the above linked issues, please see those listed below as well. If you look at issues/PRs in opencontainers/runc you will see more of the history. So far the burden was put on runc folks by k8s without discussion. Here we are trying to see if there is a better way going forward.

  • https://github.com/opencontainers/runc/issues/2471#issuecomment-1012724651
  • https://github.com/opencontainers/runc/issues/3849#issuecomment-1544519250
  • https://github.com/opencontainers/runc/pull/4436#issuecomment-2405829977

dims avatar Oct 21 '24 15:10 dims

personally a motivating factor is decoupling new features in the underlying libraries from the release cadence of the runc binary. SIG node has been waiting for 1.2.0 for about a year to get PSI metrics. If we could factor out libcontainer from runc then we could get access to those new features without requiring runc also have a release

haircommander avatar Oct 21 '24 15:10 haircommander

I am neutral as to whether kubernetes should switch to containerd/cgroups.

As for runc/libcontainer, let's take a look. Kubernetes only uses a few packages from github.com/opencontainers/runc/libcontainer, being:

  • cgroups (a handful of packages for fs-based and systemd-based cgroup v1 and v2 drivers);
  • configs (needed for cgroups above);
  • userns (should switch to github.com/moby/sys/userns; opened https://github.com/kubernetes/kubernetes/pull/128237);
  • apparmor (the only function used is IsEnabled, which is rather trivial);
  • utils (the only function used is CleanPath, which is rather trivial).

So, we're (mostly) talking about libcontainer/cgroups and libcontainer/configs here.

I am very much against forking of libcontainer, here's why. Making a fork is easy, maintaining a fork is hard (except when the original code is no longer maintained, which is not the case here). We'll need a team to port fixes from the original to the fork, and a team to port fixes from the fork to the original. We'll end up with two packages with two sets of bugs as a result. I doubt we (as a community) have resources for that.

What I think we should do is move libcontainer/cgroups out of runc and into a separate repo (under opencontainers I guess) which will have maintainers from both runc and kubernetes, and which will be consumed by both projects. This is problematic because of libcontainers/configs dependency, so the first step should be moving cgroup-related data structures from libcontainer/configs to libcontainers/cgroups. From the first glance, this is doable.

(The other problem was decoupling device management (needed by runc, not needed by other users) from cgroups managers, but this is already solved).

So, steps:

  1. Make runc/libcontainer/cgroups independent of runc/libcontainer/configs (and other runc stuff).
  2. Open a proposal to create cgroups repo under opencontainers org.
  3. Move github.com/opencontainers/runc/libcontainer/cgroups to github.com/opencontainers/cgroups.
  4. Use the above from runc and kubernetes.

kolyshkin avatar Oct 21 '24 18:10 kolyshkin

I'd like to hear what people think and, if there's an agreement, file a bug to runc and start working on items in the list.

kolyshkin avatar Oct 21 '24 18:10 kolyshkin

I like this plan! how would this work for you @dims

haircommander avatar Oct 21 '24 19:10 haircommander

@haircommander @kolyshkin 😍 🫶 Love the plan. ➕1️⃣ from me for opencontainers/cgroups

dims avatar Oct 21 '24 20:10 dims

@kolyshkin :+1: to that!

mrunalp avatar Oct 21 '24 20:10 mrunalp

I am very much against forking of libcontainer, here's why. Making a fork is easy, maintaining a fork is hard (except when the original code is no longer maintained, which is not the case here). We'll need a team to port fixes from the original to the fork, and a team to port fixes from the fork to the original. We'll end up with two packages with two sets of bugs as a result. I doubt we (as a community) have resources for that.

Very much "+1" on that. Some of these packages capture a lot of expertise and knowledge gathered over the years, and creating a fork will be risky (things will diverge, and subtle changes in behavior will creep in). The devil may be in the detail for some of these packages, as I know there's been many esoteric "fun" issues that have been fixed that are not trivial to discover (things "seem to work" until they don't).

To some extent, the containerd/cgroups package is already an example of this; it was created to provide a "fresh" take on handling cgroups and to provide a cleaner API, but while runc's code definitely was getting a bit old in the tooth, it's also battle-tested. The runc code most definitely has had more attention to fixes for (obscure) corner-cases, where the containerd one sometimes fell behind.

My ideal would still be if somehow those projects could converge, so that effort could be shared (and there would be no need to choose between a "fresher" API or rougher API, but more battle-tested).

As to https://github.com/opencontainers/runc/issues/3028, I think a main motivator there is that runc (being a Go project from "the old days") has been too permissive on exporting things; too many things that were ultimately for "internal use" were exported, and thus a public API; not always well-architected for this purpose, which now means that any change needed for runc itself now could potentially break other projects. I've been "chipping away" smaller packages that were reasonably self-contained and stable to separate modules (github.com/moby/sys/user, github.com/moby/sys/userns). I guess the cgroups package could be considered as well; assuming we can get the public API "mostly stable", and if it doesn't complicate testing and integration back into runc too much.

thaJeztah avatar Oct 21 '24 23:10 thaJeztah

+1 @kolyshkin, thank you! 🙏

I am very much against forking of libcontainer, here's why. Making a fork is easy, maintaining a fork is hard (except when the original code is no longer maintained, which is not the case here). We'll need a team to port fixes from the original to the fork, and a team to port fixes from the fork to the original. We'll end up with two packages with two sets of bugs as a result. I doubt we (as a community) have resources for that.

💯

As to https://github.com/opencontainers/runc/issues/3028, I think a main motivator there is that runc (being a Go project from "the old days") has been too permissive on exporting things; too many things that were ultimately for "internal use" were exported, and thus a public API; not always well-architected for this purpose, which now means that any change needed for runc itself now could potentially break other projects. I've been "chipping away" smaller packages that were reasonably self-contained and stable to separate modules (github.com/moby/sys/user, github.com/moby/sys/userns). I guess the cgroups package could be considered as well; assuming we can get the public API "mostly stable", and if it doesn't complicate testing and integration back into runc too much.

Kubernetes also has this problem, we've been telling people importing this repo (as opposed to the libraries "staged" out of it or developed externally outright) is ~unsupported, but haven't moved towards inernal/ yet. Personally I think communicating with internal/ is a good idea and if someone really wants that code they can fork ... but huge +1 to Kubernetes not doing that [edit: forking] and instead working with runc and the ecosystem.

BenTheElder avatar Oct 22 '24 00:10 BenTheElder

I like the proposal and don't have any concerns. We discussed this at today's SIG Node weekly too, and no concerns raised from the meeting. Peter, representing SIG Node community, will send an email to kube-dev mailing list. cc/ @haircommander

dchen1107 avatar Oct 22 '24 18:10 dchen1107

Done in https://groups.google.com/g/kubernetes-sig-node/c/ce2yLXg5SDw :slightly_smiling_face:

haircommander avatar Oct 22 '24 18:10 haircommander

Just as a note here, @kolyshkin we'll probably need a new Project proposal for a factored-out cgroups library like we have previously done for the go-digest and selinux libraries. I'm happy to facilitate the TOB vote on the proposal once it's ready.

samuelkarp avatar Oct 22 '24 18:10 samuelkarp

FYI, doing some searches to figure out if we can totally drop containerd/cgroups from k/k and found that we will need some updates on the runc side:

https://github.com/opencontainers/runc/pull/4497

dims avatar Oct 31 '24 14:10 dims

Now we wait for https://github.com/opencontainers/runc/pull/4472 to land first and then broken into a separate go.mod or a standalone library.

dims avatar Nov 05 '24 12:11 dims

Just as a note here, @kolyshkin we'll probably need a new Project proposal for a factored-out cgroups library like we have previously done for the go-digest and selinux libraries. I'm happy to facilitate the TOB vote on the proposal once it's ready.

Opened https://github.com/opencontainers/tob/pull/144

kolyshkin avatar Nov 07 '24 00:11 kolyshkin

xref: https://github.com/opencontainers/runc/pull/4472

dims avatar Dec 05 '24 15:12 dims

next up is: https://github.com/opencontainers/runc/pull/4577

dims avatar Jan 14 '25 21:01 dims

PR is open to populate opencontainers/cgroups: https://github.com/opencontainers/cgroups/pull/1

dims avatar Feb 28 '25 02:02 dims

/close

fixed in https://github.com/kubernetes/kubernetes/pull/130569

dims avatar Mar 10 '25 15:03 dims

@dims: Closing this issue.

In response to this:

/close

fixed in https://github.com/kubernetes/kubernetes/pull/130569

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Mar 10 '25 15:03 k8s-ci-robot