benchexec
Support for cgroup v2
Linux 4.5 has added version 2 of cgroup (documentation). Currently it is not ready (cpu controller is only documented but not yet merged, no sign of cpuset and freezer controllers), but when it is usable, BenchExec will need several adjustments to be able to work with cgroup-v2:
- cgroup.procs needs to be used instead of tasks (with granularity of processes instead of threads, but this is no problem for us).
- Children of current cgroup cannot be used while current cgroup contains processes: BenchExec needs to create a child cgroup for itself and move all processes from current cgroup there before creating cgroups for runs, or get as parameter a designated empty benchmarking cgroup.
- Controllers need to be enabled for each child cgroup in cgroup.subtree_control file, if we do not want to force the user to do this manually.
- /proc/<PID>/cgroup does not show controller name for cgroup-v2 hierarchy.
- There can be multiple v1 hierarchies and a v2 hierarchy, with controllers split between them (not sure if we want to support this).
- cpuacct controller has been merged into cpu controller.
- Files for reading measurements and writing limits have changed.
- Notification about OOM events needs to be done via the memory.events file.
- It might be worth adding memory protection (i.e., memory loosely "guaranteed" for a cgroup) with the memory.low file.
- Reading peak memory usage currently does not seem to be supported.
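Two of the points above (finding the cgroup-v2 entry in /proc/<PID>/cgroup and enabling controllers via cgroup.subtree_control) could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function names find_v2_cgroup_path and subtree_control_request are hypothetical and not part of BenchExec, and the sample input is made up. It relies on the documented format of /proc/<PID>/cgroup, where the v2 entry has hierarchy ID 0 and an empty controller field.

```python
from typing import Iterable, Optional


def find_v2_cgroup_path(proc_cgroup_text: str) -> Optional[str]:
    """Return the cgroup v2 path from the content of /proc/<PID>/cgroup.

    Each line has the form "hierarchy-ID:controller-list:path". The v2
    entry uses hierarchy ID 0 and an empty controller list, for example
    "0::/user.slice/session-1.scope". Returns None if there is no v2 entry.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


def subtree_control_request(controllers: Iterable[str]) -> str:
    """Build the text to write to cgroup.subtree_control to enable controllers."""
    return " ".join("+" + name for name in controllers)


# Made-up example of a hybrid setup: v1 hierarchies plus one v2 entry.
hybrid = (
    "12:memory:/user.slice\n"
    "1:name=systemd:/user.slice/session-1.scope\n"
    "0::/user.slice/session-1.scope\n"
)
print(find_v2_cgroup_path(hybrid))  # /user.slice/session-1.scope
print(subtree_control_request(["cpu", "memory", "pids"]))  # +cpu +memory +pids
```

Both helpers work on plain strings, so they can be tried without touching the real cgroup file system. Actually writing the request to cgroup.subtree_control would additionally require that the cgroup contains no processes of its own, as described in the list above.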
Once we use cgroup-v2, we can use the cgroup namespace to restrict the cgroups the processes can see: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d4021f6cd41f03017f831b3d40b0067bed54893d
There is a nice overview of the current state and information about cgroup-v2 in these slides. They claim that cpuset and cpu were added, but freezer is still missing.
We do not need freezer in container mode, so we can really start implementing support for cgroup v2 now.
Suggested order of steps:
- Implement #173 (we need a separate subgroup for the BenchExec main process anyway for cgroup v2).
- Add support for cgroup v2, keeping the existing support for cgroup v1.
- Test that everything works and performance is comparable.
- Implement #436 (which should be finally fully doable and is a great feature).
- Investigate whether there are new features of cgroup v2 that would make sense to use.
Hi, what is the status of this? The last comment is more than a year old, and I am wondering if cgroups v2 will work soon, as they have important advantages for admins. We at ParaDiSe (DIVINE) would like to run our machines with v2 enabled, but the lack of support in benchexec would complicate that.
Nothing has happened since then due to lack of manpower and because AFAIK most distributions still use hybrid mode with all controllers in cgroups-v1 hierarchies by default anyway. But of course the plan is still to implement this. Would you by chance like to contribute (parts of) this feature?
As far as I know, Arch Linux and Fedora 31+ already use the v2 hierarchy by default (actually it should be the default since Systemd v248). So its importance will increase.
As for contributions, sadly my time is limited now and I can't help in any reasonable time. Furthermore, I am no longer an active maintainer of DIVINE, so I have no direct need for benchexec; I just manage the machines it runs on (as do some other projects, some of which need cgroups v2).
I wonder whether this project will get a cgroup v2 version, or whether work on it is already under way. I would like to contribute to the cgroup v2 version if you have such a plan.
Yes, a v2 version will absolutely exist, there was never any question about it. We recently got someone working on this.
Is there a rough ETA on this?
With the release of Debian 11, they default to cgroups v2 and our infrastructure no longer runs benchexec.
There seems to be a workaround using systemd.unified_cgroup_hierarchy=0, but this interferes with Docker.
@globin is actively working on this and the ETA is some time in the next weeks or (few) months.
I am surprised to hear that Docker should have problems with cgroups v1. In fact, Docker only gained support for cgroups v2 last December(!) and was the main reason for large parts of the Linux world (including Debian) to not adopt cgroups v2 already years ago. And of course, current versions of Docker still work on Ubuntu 20.04, which has cgroups v1. So I suspect that any such problems should be solvable.
@PhilippWendler, if you suggest using another distribution as a solution, that is not acceptable in many cases. For example, we cannot have dedicated machines for SV-COMP – we need to balance the needs of different groups. In our case, we also need to run Podman (a Docker alternative), which actually requires cgroups v2 for rootless mode.
A heads-up warning for everyone waiting on cgroups-v2: While implementing this in BenchExec, we noticed that the kernel does not provide the required memory measurements with cgroups-v2. The list of files for the memory controller provides only measurements for current memory consumption, but not for peak memory consumption, which is what we need. With cgroups-v1 this is supported (memory.memsw.max_usage_in_bytes).
This won't change our plans of adding support for cgroups-v2 to BenchExec, but will mean that you would probably not get memory measurements while using it until the kernel implements this (in case we overlooked some way to get this, please tell us).
The memory limit is still expected to work.
To anyone waiting for this: We have a draft PR for cgroupsv2 support (#791) and basic functionality is working. You are invited to test this and we welcome feedback in the PR comments.
Note that as described above, memory measurements are not going to work.
Thanks for the ongoing work on this!
After my upgrade to Ubuntu 22 I hit this issue because Ubuntu now defaults to cgroups v2. I've just reverted back to v1, following https://askubuntu.com/a/19487/503686, by adding systemd.unified_cgroup_hierarchy=0 to GRUB_CMDLINE_LINUX_DEFAULT in my system's /etc/default/grub. So far, I can run benchexec again and nothing is breaking :)
Thought I'd document the easy workaround for future seekers.
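To check which mode a machine ended up in after such a change, something like the following Python sketch may help. It assumes the usual mount point /sys/fs/cgroup and relies on the fact that the cgroup.controllers file exists only in cgroup v2 directories. The function name uses_cgroup_v2 is hypothetical and not part of BenchExec.

```python
import os


def uses_cgroup_v2(mount_point: str = "/sys/fs/cgroup") -> bool:
    """Return True if mount_point looks like the root of a cgroup v2 hierarchy.

    The file cgroup.controllers exists only in cgroup v2 directories. In
    v1 or hybrid mode, /sys/fs/cgroup instead contains one subdirectory
    per v1 hierarchy, so the file is absent at the top level.
    """
    return os.path.isfile(os.path.join(mount_point, "cgroup.controllers"))


if uses_cgroup_v2():
    print("cgroup v2 (unified hierarchy) is mounted at /sys/fs/cgroup")
else:
    print("no cgroup v2 hierarchy at /sys/fs/cgroup (v1, hybrid, or not mounted)")
```

The mount point is a parameter so that the check can also be pointed at a different location if the hierarchy is mounted elsewhere.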
> The list of files for the memory controller provides only measurements for current memory consumption, but not for peak memory consumption, which is what we need. With cgroups-v1 this is supported (memory.memsw.max_usage_in_bytes).
>
> This won't change our plans of adding support for cgroups-v2 to BenchExec, but will mean that you would probably not get memory measurements while using it until the kernel implements this (in case we overlooked some way to get this, please tell us).
I don't really know anything about cgroups other than coming here after upgrading to Ubuntu 22.04, but isn't memory.peak from there just that?
Yes. memory.peak is a really recent feature from Linux 5.19 and newer than the comment that you quoted. :-)
We will add support for it, but because it is so new this will also mean that many users will at first not be able to use it (for example the kernel on Ubuntu 22.04 is older and does not support it).
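Because memory.peak is missing on older kernels, code that reads it needs a fallback. The following Python sketch shows one possible way to handle this; it is not the BenchExec implementation, and the function name read_peak_memory and the example cgroup path are hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_peak_memory(cgroup_dir: str) -> Optional[int]:
    """Return the peak memory usage in bytes of a cgroup v2 directory.

    Returns None if memory.peak does not exist, which is the case on
    kernels that do not provide this file yet, so that callers can
    report "no measurement" instead of failing.
    """
    try:
        return int((Path(cgroup_dir) / "memory.peak").read_text().strip())
    except FileNotFoundError:
        return None


# Hypothetical cgroup path, for illustration only.
peak = read_peak_memory("/sys/fs/cgroup/benchmark/run1")
if peak is None:
    print("peak memory measurement not available")
else:
    print(f"peak memory: {peak} bytes")
```

Returning None instead of raising keeps the rest of a run usable on systems where only the memory limit, but not the peak measurement, is supported.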
To everyone who is following this: We now have an implementation that should be usable and is close to being merged. However, we need people to test it on their systems and document their experience. Please see #791 for more information. Your help is greatly appreciated!