[Alpine] docker top, runc ps fail with cgroup2 with: unable to get all container pids
Description
docker top and runc ps fail with:
alpine:~$ docker top 09e847645eec
Error response from daemon: runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135/cgroup.procs: operation not supported
: unknown
~ # runc --root /run/docker/runtime-runc/moby ps 09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135
ERRO[0000] unable to get all container pids: read /sys/fs/cgroup/docker/09e847645eec8091d041c27b5ff969825b10155b60ca00230043c87764884135/cgroup.procs: operation not supported
~ #
when the system has cgroup2 mounted as:
alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
alpine:~$
and this does not happen when cgroup v1 is mounted (in addition to, or instead of cgroup v2).
The issue was found on Alpine Edge with packages:
alpine:~$ apk list -I|grep -E 'runc|docker|containerd'|sort
containerd-1.7.7-r2 x86_64 {containerd} (Apache-2.0) [installed]
containerd-openrc-1.7.7-r2 x86_64 {containerd} (Apache-2.0) [installed]
docker-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-cli-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-cli-buildx-0.11.2-r3 x86_64 {docker-cli-buildx} (Apache-2.0) [installed]
docker-engine-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
docker-openrc-24.0.6-r4 x86_64 {docker} (Apache-2.0) [installed]
runc-1.1.9-r2 x86_64 {runc} (Apache-2.0) [installed]
alpine:~$
Alpine uses OpenRC, which lets you specify the cgroup mount strategy in /etc/rc.conf:
# This sets the mode used to mount cgroups.
# "hybrid" mounts cgroups version 2 on /sys/fs/cgroup/unified and
# cgroups version 1 on /sys/fs/cgroup.
# "legacy" mounts cgroups version 1 on /sys/fs/cgroup
# "unified" mounts cgroups version 2 on /sys/fs/cgroup
#rc_cgroup_mode="unified"
and the issue mentioned above is observed when rc_cgroup_mode is unified:
alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
alpine:~$
and is not observed when it's legacy:
alpine:~$ mount|grep cgroup
cgroup_root on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,inode64)
openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
alpine:~$
or hybrid:
alpine:~$ mount|grep cgroup
cgroup_root on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755,inode64)
openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc)
none on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)
blkio on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
memory on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
devices on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
perf_event on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
net_prio on /sys/fs/cgroup/net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio)
hugetlb on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
pids on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
alpine:~$
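To tell quickly which of the three layouts a host is running, the mount table can be classified with a small helper. This is only a sketch: the function name cgroup_mode is made up for illustration, and it reads a /proc/mounts-style file (filesystem type in the third field) rather than the output of mount shown above.

```shell
# cgroup_mode [MOUNTS_FILE]
# Hypothetical helper: prints "unified", "hybrid", "legacy", or "none"
# depending on which cgroup filesystem types appear in a /proc/mounts-style
# file. Defaults to /proc/mounts when no argument is given.
cgroup_mode() {
    mounts_file="${1:-/proc/mounts}"
    awk '
        $3 == "cgroup2" { v2 = 1 }   # any cgroup v2 mount
        $3 == "cgroup"  { v1 = 1 }   # any cgroup v1 controller mount
        END {
            if (v2 && v1) print "hybrid"
            else if (v2)  print "unified"
            else if (v1)  print "legacy"
            else          print "none"
        }
    ' "$mounts_file"
}
```

Calling cgroup_mode with no argument on the affected host should print unified, which is the only layout where the failure was observed.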
Steps to reproduce the issue
- Start any container with docker run -it --rm <any container>
- Execute docker top <container id> or runc --root /run/docker/runtime-runc/moby ps <container id>
Describe the results you received and expected
Expected: the command displays a list of the processes in the container. Received: the "unable to get all container pids" error shown in the description.
What version of runc are you using?
runc version 1.1.9
commit: 82f18fe0e44a59034f3e1f45e475fa5636e539aa
spec: 1.0.2-dev
go: go1.21.3
libseccomp: 2.5.4
Host OS information
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19_alpha20230901
PRETTY_NAME="Alpine Linux edge"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
Host kernel information
Linux alpine 6.1.59-0-lts #1-Alpine SMP PREEMPT_DYNAMIC Fri, 20 Oct 2023 06:24:46 +0000 x86_64 Linux
The issue is reproducible with runc taken from the main git branch.
@kholmanskikh can you please check and confirm or deny that this is caused by the nsdelegate option of the cgroup v2 mount?
The issue is also reproducible when cgroup2 is mounted without the nsdelegate option:
alpine:~$ mount|grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
alpine:~$ docker run --rm -it -d alpine
2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6
alpine:~$ docker top 2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6
Error response from daemon: runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/2babd8f8f743beea96d6f2fba02de19036e0f734d8c1d249ac694b8ad501f0e6/cgroup.procs: operation not supported
: unknown
alpine:~$
Related downstream issues:
- https://gitlab.alpinelinux.org/alpine/aports/-/issues/15506
- https://gitlab.alpinelinux.org/alpine/aports/-/issues/15570
It also fails to start containers with the --memory option:
$ docker run --rm -it --memory 2G alpine
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers -- it is in domain threaded mode: unknown.
In this case I have the following daemon.json:
{
  "storage-driver": "overlay2",
  "cgroup-parent": "/docker"
}
EDIT: but if I use:
{
  "cgroup-parent": "/dockerContainers"
}
it actually works.
Could it be that runc sets docker/cgroup.type to domain threaded?
If I restart the docker daemon, the type is initially domain, but after the first container run it changes to domain threaded:
ncopa-desktop:~$ doas /etc/init.d/docker start
* Starting Docker Daemon ... [ ok ]
ncopa-desktop:~$ cat /sys/fs/cgroup/docker/cgroup.type
domain
ncopa-desktop:~$ docker run --rm alpine echo hello
hello
ncopa-desktop:~$ cat /sys/fs/cgroup/docker/cgroup.type
domain threaded
Why does it end up setting the cgroup type to domain threaded?
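For anyone who wants to check the same thing on their own host, here is a minimal sketch that prints cgroup.type for a parent cgroup and its direct children. The function name print_cgroup_types is made up for illustration, and the default path assumes the /docker cgroup parent used above. As far as I understand the kernel's cgroup v2 documentation, reading cgroup.procs in a cgroup whose type is threaded fails with EOPNOTSUPP ("operation not supported"), so a per-container cgroup reported as threaded would match the error from docker top. That the container cgroups are threaded here is an assumption, not something confirmed in this thread.

```shell
# print_cgroup_types [ROOT]
# Hypothetical helper: prints "<type><TAB><directory>" for ROOT and for each
# direct child directory that contains a cgroup.type file.
# ROOT defaults to /sys/fs/cgroup/docker (an assumption based on this report).
print_cgroup_types() {
    root="${1:-/sys/fs/cgroup/docker}"
    for dir in "$root" "$root"/*/; do
        # Skip non-cgroup directories and the unexpanded glob when ROOT has no children.
        [ -f "$dir/cgroup.type" ] || continue
        printf '%s\t%s\n' "$(cat "$dir/cgroup.type")" "${dir%/}"
    done
}
```

Running print_cgroup_types right after starting the daemon and again after the first container run should show whether only the parent flips to domain threaded or whether the per-container cgroups change type as well.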
I found out that docker itself does not create /sys/fs/cgroup/docker. It is openrc that creates it.
It seems that docker's default cgroup-parent is also docker. I think what happens here is that docker and openrc are stepping on each other's toes.
Hi, I have the same issue under Portainer. I installed Alpine Linux x64, and when I try to look at container stats in Portainer, I get the following error:
"runc did not terminate successfully: exit status 1: unable to get all container pids: read /sys/fs/cgroup/docker/c7fe07c5253dba763ce8fde71945c3a5ac32998ae50dc1345dba7cffd6fab5fa/cgroup.procs: operation not supported: unknown"
I have had many containers running fine for a while now, but I'm unable to get stats.