void-runit
void-runit copied to clipboard
pseudofs: cgroup: various improvements
Default to cgroup2 Attempt to mount cgroup2 in containers to support nesting Mount the systemd tracking cgroup when using cgroup1
Closes https://github.com/void-linux/void-runit/issues/74
Former title: rough thoughts on cgroup1 and cgroup2
Former description:
cgroup1
the various container tool run scripts variably handle certain cases, namely that of the systemd tracking cgroup.
i am actually not 100% confident on why that cgroup needs to be mounted, but i imagine it has to do with the needs of systemd instances that may be running in containers.
anyway supposedly it needed to be mounted at one point so the run scripts would do it. but if the run scripts did it, it could not have been that harmful. so just do it always. (in legacy/hybrid mode, which I have not been using for some time).
cgroup2
alright so while we are at it default to pure-cgroup2 / "unified". i don't know why anyone would want to use a hybrid. and i own a phev. cgroup2 is just a better default. more compatible and future proof.
to top it off, start mounting cgroup2 when running in a container. because LXD can not or will not do that for us, the container. this actually ignores rc.conf completely. might need some work
@CameronNemo thanks for this. I think I'm tracking what's going on here, and the general approach looks like it would solve the problems with running docker inside a linux container.
~~so I've confirmed that setting CGROUP_MODE=unified
in /etc/rc.conf
will fix the docker launch issue.~~
I take it back - this didn't seem to do the trick post-reboot. docker ps
works but no containers can be started. You still need the mount -t cgroup2 cgroup2 /sys/fs/cgroup/
in /etc/rc.local
to make it work.
@sbromberger one difference I see is that void-runit is using the nsdelegate option, and you do not appear to be doing so.
one difference I see is that void-runit is using the nsdelegate option, and you do not appear to be doing so.
Sorry - a bit confused here. Where would I have made that change?
(Oh - you mean my explicit mount in /etc/rc.local
- gotcha. I'm not familiar with nsdelegate
.)
@sbromberger See https://man.voidlinux.org/cgroups#CGROUPS_DELEGATION:_DELEGATING_A_HIERARCHY_TO_A_LESS_PRIVILEGED_USER
The nsdelegate mount option only has an effect when performed in the initial mount namespace; in other mount namespaces, the option is silently ignored.
Not sure why specifying it would fail. Should be ignored in your case. Maybe try that option to see if it changes anything. Otherwise my patch here is probably broken for some reason.
Just to be clear - I haven't tried this PR yet. It's on my list.
OK. With these changes and with the /etc/sv/docker/run
file whittled down to just exec chpst -o 1048576 -p 1048576 dockerd $OPTS 2>&1
, docker comes up properly after a reboot, and images run properly.
@leahneukirchen any thoughts on any of these changes?
- default to cgroup2
- try to mount cgroup2 in containers but ignore any errors
- mount the systemd tracking cgroup when using cgroup1
@sbromberger I made some changes, mainly just cleaning it up but basically doing the same thing. If you want to retest with the most recent changes, that would be helpful.
Sorry for the delay. This, along with commenting out all but the last line in /etc/sv/docker/run, allows docker to run within containers with no issues whatsoever.
Thanks, and Happy New Year!