[real-time] cgroups v2 support implementation

Open rucoder opened this issue 3 months ago • 7 comments

Description

NOTE: this PR targets the real-time branch!

This PR also contains some changes that won't be merged into the master branch; they are just handy for debugging.

  1. Task list
  • [x] integrate new runc and containerd
  • [x] update containerd config
  • [x] update "user" containerd
  • [ ] update startup scripts
  • [x] test eve onboarding
  • [x] test app deployment
  • [x] add a script to debug container to audit cgroup v2
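
A minimal sketch of the kind of check such an audit script could start with, assuming the hierarchy is mounted at `/sys/fs/cgroup` and that a readable `cgroup.controllers` file at the mount root indicates cgroup v2. The function name `read_controllers` is hypothetical and is not taken from the actual debug-container script.

```python
import os


def read_controllers(root="/sys/fs/cgroup"):
    """Return the controllers listed in <root>/cgroup.controllers.

    Hypothetical helper: returns None when the file is missing, which this
    sketch treats as "cgroup v2 is not mounted at root".
    """
    path = os.path.join(root, "cgroup.controllers")
    try:
        with open(path) as f:
            # The file holds a space-separated list, e.g. "cpuset cpu io memory".
            return f.read().split()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    controllers = read_controllers()
    if controllers is None:
        print("cgroup v2 not detected")
    else:
        print("cgroup v2 controllers:", " ".join(controllers))
```

Taking the mount point as a parameter keeps the check usable against a copied or mocked directory tree as well as the live system.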

PR dependencies

This PR depends on kernel PR https://github.com/lf-edge/eve-kernel/pull/215

How to test and validate this PR

Changelog notes

PR Backports

- 14.5-stable: No, as the feature is not available there.
- 13.4-stable: No, as the feature is not available there.

Checklist

  • [ ] I've provided a proper description
  • [ ] I've added the proper documentation
  • [ ] I've tested my PR on amd64 device
  • [ ] I've tested my PR on arm64 device
  • [ ] I've written the test verification instructions
  • [ ] I've set the proper labels to this PR

And last but not least:

  • [ ] I've checked the boxes above, or I've provided a good reason why I didn't check them.

Please, check the boxes above after submitting the PR in interactive mode.

rucoder Sep 02 '25 22:09

The change is huge. It should come with an extensive test plan to check for regressions in any component. I also don't understand why we need v2 and a specific RT branch in general. We tested RT before without any extra changes and it was ok.

@OhmSpectator because I have extra debugging commits and I do not plan to "extensively test" now. Besides, memory-monitor is commented out and not yet updated.

rucoder Sep 03 '25 13:09

I also don't understand why we need v2 and a specific RT branch in general. We tested RT before without any extra changes and it was ok.

I second part of this. I think we should go to containerd v2 and runc v1.3.0, and we should go to cgroups v2. I also think we should not defer because it is a big change; we do that a bit too much (not blaming anyone, I am as much responsible as the next person; it just is reality).

I do think we should have an RT branch if there are RT things that are needed and unique, even just for testing and evaluation. But these changes (other than RT-specific) should go into the master branch, and should go as soon as possible. Especially after @rucoder did all of the work to figure this out.

deitch Sep 03 '25 15:09

I also don't understand why we need v2 and a specific RT branch in general. We tested RT before without any extra changes and it was ok.

I second part of this. I think we should go to containerd v2 and runc v1.3.0, and we should go to cgroups v2. I also think we should not defer because it is a big change; we do that a bit too much (not blaming anyone, I am as much responsible as the next person; it just is reality).

I do think we should have an RT branch if there are RT things that are needed and unique, even just for testing and evaluation. But these changes (other than RT-specific) should go into the master branch, and should go as soon as possible. Especially after @rucoder did all of the work to figure this out.

I do not disagree. I will post a PR to master when all components are converted to cgroup v2. For now memory-monitor is not converted, so we cannot make that PR yet, but for RT memory-monitor is not an essential part as of now.

rucoder Sep 03 '25 16:09

The change is huge. It should come with an extensive test plan to check for regressions in any component. I also don't understand why we need v2 and a specific RT branch in general. We tested RT before without any extra changes and it was ok.

@OhmSpectator The goal this time is to test RT with containers running on the host, not using VMs. The other goal is to use the Intel cache allocation technology, which assumes cgroupv2. For those reasons it makes sense to switch to cgroupv2.

Having said that, I'm a fan of getting the cgroupv2 into master, but we need the memory monitor changes to v2 before we do that. (Branches that are either long-lived and/or have lots of changes are a bad idea in general.)

eriknordmark Sep 04 '25 15:09

/rerun red

rucoder Sep 08 '25 16:09

As @OhmSpectator pointed out offline, we need to check how consistent our memory settings are. Pillar may use kernel command-line arguments to update memory limits; however, the only source of truth is the value in the config when we update memory limits, and a file under /sys/fs/cgroup when we read the current limit.
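
A rough sketch of such a consistency check, assuming the limit is read from the cgroup v2 `memory.max` file, which contains either a byte count or the literal string `max` when no limit is set. The helper names and the `configured_bytes` parameter are hypothetical; how Pillar actually stores and applies the configured value is not shown here.

```python
import os


def read_memory_limit(cgroup_dir):
    """Return the cgroup v2 memory limit in bytes, or None when unlimited."""
    with open(os.path.join(cgroup_dir, "memory.max")) as f:
        raw = f.read().strip()
    # cgroup v2 reports the literal string "max" when no limit is set.
    return None if raw == "max" else int(raw)


def limit_matches_config(cgroup_dir, configured_bytes):
    """Compare the limit read from the cgroup with the value from config.

    Hypothetical check: configured_bytes stands in for whatever value the
    device configuration holds (None meaning "no limit").
    """
    return read_memory_limit(cgroup_dir) == configured_bytes
```

An exact comparison may be too strict in practice, since the kernel may round a written limit to a page boundary; comparing after rounding the configured value the same way would be one option.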

rucoder Sep 11 '25 08:09

@eriknordmark tests failed because this branch uses an old kernel on CI. I'm almost done with the RT kernel and will integrate it soon.

rucoder Sep 17 '25 06:09