cgroup-parent puts BuildKit builds in the wrong cgroup
Contributing guidelines
- [x] I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [x] ... the documentation does not mention anything about my problem
- [x] ... there are no open or closed issues that are related to my problem
Description
When cgroup-parent is set in /etc/docker/daemon.json, it puts BuildKit builds in the wrong parent cgroup, so any applicable limits don't take effect.
Expected behaviour
BuildKit builds should get put in the same parent cgroup as ordinary containers and non-BuildKit builds.
Actual behaviour
BuildKit builds get put in a slightly different parent cgroup than ordinary containers and non-BuildKit builds.
Buildx version
github.com/docker/buildx v0.19.3 48d6a39
Docker info
Client: Docker Engine - Community
 Version: v27.4.1
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version: v0.19.3
    Path: /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version: v2.32.1
    Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: v27.4.1
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 88bf19b2105c8b17560993bee28a01ddc2f97182
 runc version: v1.2.2-0-g7cb3632
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.14.0-503.21.1.el9_5.ppc64le
 Operating System: Red Hat Enterprise Linux 9.5 (Plow)
 OSType: linux
 Architecture: ppc64le
 CPUs: 144
 Total Memory: 123.6GiB
 Name: <redacted>
 ID: 28e41ca5-866a-47aa-a1ee-e4b2e5bf109e
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
Builders list
NAME/NODE      DRIVER/ENDPOINT   STATUS    BUILDKIT   PLATFORMS
default*       docker
 \_ default     \_ default       running   v0.17.3    linux/ppc64le
Configuration
/etc/systemd/system/docker_limit.slice:
[Unit]
Description=Slice that limits docker resources
Before=slices.target
[Slice]
MemoryAccounting=true
MemoryLimit=64G
/etc/docker/daemon.json:
{
  "cgroup-parent": "docker_limit.slice"
}
Dockerfile:
FROM gcc
COPY ./use100gb.c .
RUN gcc use100gb.c -o use100gb && ./use100gb
use100gb.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
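/* One GiB; parenthesized and typed as size_t so the macro expands safely in any expression. */
#define GB ((size_t)1024 * 1024 * 1024)
int main(void) {
    /* Allocate and touch 1 GiB at a time, up to 100 GiB, so the pages are actually committed. */
    for (int i = 1; i <= 100; ++i) {
        void *buf = malloc(GB);
        if (!buf) break;
        memset(buf, 'x', GB);
        printf("%d\n", i);
    }
    puts("Press Ctrl+C to exit");
    /* Keep the process and its memory alive so its cgroup can be inspected. */
    for (;;) sleep(1);
}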
$ DOCKER_BUILDKIT=0 docker build . # gets OOM-killed, correctly
$ docker build . # takes up 100GB of memory, incorrectly
Build logs
Additional info
For ordinary containers and non-BuildKit builds, /proc/PID/cgroup contains something like this (correct):
0::/docker_limit.slice/docker-b9ba399c60c1a2407001ee90ee3307ee3104e2f1c1db25ec6337ab298fe7518f.scope
For BuildKit builds, /proc/PID/cgroup contains something like this (incorrect):
0::/system.slice/docker_limit.slice:docker:k4xghul4df75u16fk09bfyizr
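The difference between the two paths above can be checked mechanically. The following is a minimal sketch, assuming cgroup v2 (where /proc/PID/cgroup holds a single line starting with 0::). The function name cgroup_under_parent is hypothetical and is not part of Docker or BuildKit.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a cgroup v2 line from /proc/PID/cgroup
# places the process directly under the expected parent slice.
# Usage: cgroup_under_parent "<line from /proc/PID/cgroup>" "<parent slice>"
cgroup_under_parent() {
    line=$1
    parent=$2
    # cgroup v2 lines have the form "0::<path>"; strip the "0::" prefix.
    path=${line#0::}
    case $path in
        "/$parent"/*) return 0 ;;  # e.g. /docker_limit.slice/docker-<id>.scope
        *)            return 1 ;;  # anything else, including /system.slice/...
    esac
}
```

For example, cgroup_under_parent "$(cat /proc/PID/cgroup)" docker_limit.slice (with PID replaced by the process of interest) succeeds for the ordinary-container path and fails for the BuildKit path shown above.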
> /etc/docker/daemon.json: { "cgroup-parent": "docker_limit.slice" }
I don't think it reads cgroup-parent from dockerd config looking at: https://github.com/docker/buildx/blob/16edf5d4aa76ef9978c06889d66249aeab5729fe/driver/docker-container/driver.go#L160-L180
Does it work if you set it in build command?:
docker build --cgroup-parent=docker_limit.slice .
> I don't think it reads cgroup-parent from dockerd config looking at:
It's odd, then, that the incorrect cgroup path still contains the string docker_limit.slice.
> Does it work if you set it in build command?:
> docker build --cgroup-parent=docker_limit.slice .
Yes, that works. So my goal is to make that the default for all docker builds.
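Until the daemon-level setting applies to BuildKit builds, one way to approximate that default on the client side is a small shell wrapper. This is only a sketch: the function name docker_build_limited and the DOCKER_BUILD_CGROUP_PARENT variable are hypothetical, and it relies solely on the --cgroup-parent flag of docker build mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: run "docker build" with a default --cgroup-parent.
# DOCKER_BUILD_CGROUP_PARENT is an assumed, user-defined variable (not a
# Docker setting); it falls back to the slice used in this report.
docker_build_limited() {
    docker build \
        --cgroup-parent="${DOCKER_BUILD_CGROUP_PARENT:-docker_limit.slice}" \
        "$@"
}
```

It would be invoked as docker_build_limited . in place of docker build . and does not change the behaviour of builds started any other way.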
I'm also encountering the same issue.
@crazy-max I assume the code you linked is only reached in combination with some other options. When I remove /etc/docker/daemon.json completely, the build processes do not run in the /docker/buildx slice, as the comment would imply, but in the top-level system.slice. This is with Docker 27.1.2.