Reproducible: bake creates un-prunable files in overlay2
Contributing guidelines
- [X] I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [X] ... the documentation does not mention anything about my problem
- [ ] ... there are no open or closed issues that are related to my problem
Description
Up front: sorry for the wall of text, but this problem seems to be very elusive.
One of our docker hosts frequently ran out of disk space, despite pruning containers, logs, images, volumes and build cache.
The directory "/var/lib/docker/overlay2" often used about 100GB, while:
- docker ps -a showed nothing
- docker volume ls showed nothing
- docker images -a showed nothing
- docker system df showed 0 bytes being used
For pruning we ran:
docker system prune -a -f --volumes
docker builder prune -a -f
docker buildx prune -a -f
Whenever this situation occurred, the only thing we could do was uninstall docker, remove /var/lib/docker, and reinstall it:
apt remove docker-ce
rm -rf /var/lib/docker
apt install docker-ce
I did not tick "there are no open or closed issues..." because there are several issues / forum threads out there along the lines of "/var/lib/docker/overlay2 fills up my disk". Usually these come down to the author not knowing about the docker (builder) prune commands or all of the additional flags that can be set for these commands.
I am quite confident that after the prune commands mentioned above, /var/lib/docker/overlay2 should be empty. As a matter of fact, it does end up empty, except when using docker buildx bake with very specific inputs, which we managed to pin down to a very small, reproducible setup.
Expected behaviour
I generally expect to be able to free up disk space consumed by any docker resource through the docker CLI without having to reinstall docker. Please let me know if this is somehow misguided.
More specifically, after deleting all containers, images, and volumes and pruning the build cache, I expect /var/lib/docker/overlay2 to be empty (except for the folder called l). To "empty" docker, we typically used docker system prune -a -f --volumes, docker builder prune, and docker buildx prune.
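The expectation above ("empty except for l") can be turned into a small check. This is only a sketch: the function names leftover_layers and overlay2_is_clean are made up for illustration, and the default path assumes the standard Docker root directory.

```shell
#!/bin/sh
# Print every entry in an overlay2 directory except the folder called "l".
# The path is an argument so the check can be tried on a scratch directory first.
leftover_layers() {
    dir=${1:-/var/lib/docker/overlay2}
    # -A includes dotfiles but omits "." and ".."; grep -vx drops only the exact name "l".
    ls -A "$dir" | grep -vx 'l'
}

# Exit status 0 only when nothing besides "l" is present.
overlay2_is_clean() {
    [ -z "$(leftover_layers "$1")" ]
}
```

Run as root after pruning, for example: overlay2_is_clean || leftover_layers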
Actual behaviour
When using the dockerfiles and docker-bake.hcl provided below, buildx produces files in /var/lib/docker/overlay2 which we seem to have no way of deleting through the cli and which are also not counted when running docker system df.
We have reduced our original dockerfiles and docker-bake.hcl as far as we could until simplifying further makes the problem disappear. Whenever we made any modification we "reset" our docker installation by uninstalling, removing /var/lib/docker and reinstalling.
The steps below were reproducible on two different machines:
Step 1: Make sure your docker setup is "empty"
$ /var/lib/docker/overlay2 ls
l
$ /var/lib/docker/overlay2 du -hs .
8.0K .
$ /var/lib/docker/overlay2 docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
Step 2: Create dockerfiles and docker-bake.hcl
In a directory of your choice, create the files I have provided in section "Configuration".
Step 3: Build the docker images with docker buildx bake
$ ~/c/docker-garbage docker buildx bake
< Logs provided in section "Build logs" below. >
Step 4: Check disk usage of docker and /var/lib/docker/overlay2
$ ~/c/docker-garbage docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          2         0         1.602GB   1.602GB (100%)
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     8         0         355B      355B
We are not running any containers at the moment. mount shows no bind mounts in overlay2.
$ /var/lib/docker/overlay2 ls
602e8d236dc20cc7a5b8c6e8eee69c1a9757f8541ba9a50682d6b4762ce80150 f6s5n1yu2asw6rkr6qc87j64k ndxiwzkjek6dke51shxtcfhrv
98th2ukd5n6fz7hfn2l6er2bq hoc4down9mqimid31ttmrbau7 r2q82nycxvnuym2htxvrcsiab
ac60195faf150523e49f0adcd9e3a306075bc7f51c053f6b4962dd70935fbef0 l ytixoxoym50muwljx6e8oobcq
$ /var/lib/docker/overlay2 du -hs .
1.7G .
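For anyone repeating the mount check mentioned in this step, here is one way it could be scripted. It is a sketch under assumptions: mounts_under is a hypothetical helper name, it reports any mount under the directory (not only bind mounts), and the /proc/mounts default is Linux-specific.

```shell
#!/bin/sh
# Print mount-table lines whose mount point lies at or under the given directory.
# Arguments: the directory to look for, and the mount table to read
# (defaults to /proc/mounts).
mounts_under() {
    dir=$1
    table=${2:-/proc/mounts}
    # Field 2 of each line is the mount point; match it exactly or as a path prefix.
    awk -v d="$dir" '$2 == d || index($2, d "/") == 1' "$table"
}
```

Calling mounts_under /var/lib/docker/overlay2 should print nothing when no such mounts exist.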
Step 5: Prune docker
Notice that less than the "reclaimable" amount is reclaimed.
$ ~/c/docker-garbage docker system prune -a -f --volumes
Deleted Images:
untagged: child:latest
deleted: sha256:78f42581308f5aaa74e133eef3caaa68c74f3aab76dcb820d365fa222213a046
untagged: parent:latest
deleted: sha256:b400bd392d4f0b10d4b01fc4a978f59073d0d58c2b56a005c1ad77bf23291ce4
Deleted build cache objects:
r2q82nycxvnuym2htxvrcsiab
ig1fym5s2s41ov8lnbojhqi86
f6s5n1yu2asw6rkr6qc87j64k
98th2ukd5n6fz7hfn2l6er2bq
re3yl384a1hbepqwrl8k9p251
ytixoxoym50muwljx6e8oobcq
vk7y6vkww45ma28i321trsfrm
elqgvrx10rufna56k4bm10gc8
Total reclaimed space: 1.485GB
And the builder cache pretends to be empty!
$ ~/c/docker-garbage docker builder prune -a -f
Total: 0B
$ ~/c/docker-garbage docker buildx prune -a -f
Total: 0B
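To put a number on the gap between "reclaimable" (Step 4) and "reclaimed" (this step), the two figures can be subtracted. The helpers below are a sketch with made-up names (to_bytes, reclaim_gap), and they assume the sizes use decimal units (1 GB = 1000^3 bytes).

```shell
#!/bin/sh
# Convert a size such as "355B", "12kB" or "1.485GB" to bytes.
# Assumption: decimal units (1 GB = 1000^3 bytes).
to_bytes() {
    echo "$1" | awk '{
        n = $0; sub(/[A-Za-z]+$/, "", n)   # numeric part
        u = $0; sub(/^[0-9.]+/, "", u)     # unit suffix
        m = 1
        if (u == "kB") m = 1000
        else if (u == "MB") m = 1000 ^ 2
        else if (u == "GB") m = 1000 ^ 3
        else if (u == "TB") m = 1000 ^ 4
        printf "%.0f\n", n * m
    }'
}

# Difference in bytes between the reclaimable figure and the reclaimed figure.
reclaim_gap() {
    echo "$(( $(to_bytes "$1") - $(to_bytes "$2") ))"
}
```

With the image figures from this report, reclaim_gap 1.602GB 1.485GB prints 117000000, i.e. roughly 117MB that was reported as reclaimable but not reclaimed.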
Step 6: Check disk usage again
As expected, docker system df shows 0B being used:
$ ~/c/docker-garbage docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
Other docker commands also show no resources:
$ ~/c/docker-garbage docker volume ls
DRIVER VOLUME NAME
$ ~/c/docker-garbage docker images -a
REPOSITORY TAG IMAGE ID CREATED SIZE
$ ~/c/docker-garbage docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
We are still not running any containers at the moment. mount shows no bind mounts in overlay2.
But overlay2 still uses 1.7GB of disk space:
$ /var/lib/docker/overlay2 du -hs .
1.7G .
$ /var/lib/docker/overlay2 ls
602e8d236dc20cc7a5b8c6e8eee69c1a9757f8541ba9a50682d6b4762ce80150 hoc4down9mqimid31ttmrbau7 l
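Comparing the Step 4 and Step 6 listings by eye works for a setup this small. For larger hosts, the surviving directories could be extracted mechanically. A sketch, assuming both listings were saved one name per line (for example with ls -1 /var/lib/docker/overlay2 > before.txt); survivors is a hypothetical helper name.

```shell
#!/bin/sh
# Print the entries present in both listings, i.e. those that survived the prune.
# Arguments: the "before" listing and the "after" listing, one name per line.
survivors() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # comm needs sorted input; LC_ALL=C keeps sort and comm in agreement on ordering.
    LC_ALL=C sort "$1" > "$before_sorted"
    LC_ALL=C sort "$2" > "$after_sorted"
    # -12 suppresses lines unique to either file, leaving the intersection;
    # the folder "l" is expected to remain, so it is filtered out.
    LC_ALL=C comm -12 "$before_sorted" "$after_sorted" | grep -vx 'l'
    rm -f "$before_sorted" "$after_sorted"
}
```

Applied to the two listings in this report, this would leave the 602e8d23... and hoc4down... directories.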
I have played around with this setup a little more (changed FROM and RUN commands, etc.) Whenever something made the issue disappear, I left a comment in the dockerfile or docker-bake.hcl.
Why I believe it is a problem with bake...
When using
docker buildx build -t parent -f parent.dockerfile .
docker buildx build -t child -f child.dockerfile .
to build these images instead of bake, and then pruning, no files are left behind in overlay2!
Buildx seems to 'know' about these files...
Another interesting observation is that when I remove these folders without also reinstalling docker and then try to build again, the build fails because the directories are missing! So it seems that buildx is somehow aware of these files, it just doesn't prune them! And they are also not counted in system df...
$ /var/lib/docker/overlay2 rm -rf 602e8d236dc20cc7a5b8c6e8eee69c1a9757f8541ba9a50682d6b4762ce80150/ hoc4down9mqimid31ttmrbau7/
$ ~/c/docker-garbage docker buildx bake
[+] Building 9.4s (9/9) FINISHED docker:default
=> [parent internal] load build definition from parent.dockerfile 0.0s
=> => transferring dockerfile: 314B 0.0s
=> [parent internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> CANCELED [child] resolve image config for docker.io/docker/dockerfile:1 2.4s
=> [parent] docker-image://docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 4.3s
=> => resolve docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 0.0s
=> => sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 8.40kB / 8.40kB 0.0s
=> => sha256:657fcc512c7369f4cb3d94ea329150f8daf626bc838b1a1e81f1834c73ecc77e 482B / 482B 0.0s
=> => sha256:a17ee7fff8f5e97b974f5b48f51647d2cf28d543f2aa6c11aaa0ea431b44bb89 1.27kB / 1.27kB 0.0s
=> => sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232 11.80MB / 11.80MB 4.0s
=> => extracting sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232 0.2s
=> [parent internal] load metadata for docker.io/library/debian:bookworm 2.4s
=> [parent 1/2] FROM docker.io/library/debian:bookworm@sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 0.1s
=> => resolve docker.io/library/debian:bookworm@sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 0.0s
=> => sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 1.85kB / 1.85kB 0.0s
=> => sha256:8a6e23e1b192b30eff14036a92e9ecdb551a1a10aa8535728b0c13d14d8c9462 529B / 529B 0.0s
=> => sha256:2657a4a0a6d5e8b3515004185275768f115a64a833de40125bb3f6b0b8cc598b 1.46kB / 1.46kB 0.0s
=> [child internal] load build definition from child.dockerfile 0.1s
=> => transferring dockerfile: 130B 0.0s
=> [child internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> ERROR [parent 2/2] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y texlive-latex-extra 0.0s
------
> [parent 2/2] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y texlive-latex-extra:
------
parent.dockerfile:8
--------------------
6 |
7 | # If you install "tree", instead of texlive, the issue disappears
8 | >>> RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y texlive-latex-extra
9 |
10 |
--------------------
ERROR: failed to solve: failed to prepare sha256:7c85cfa30cb11b7606c0ee84c713a8f6c9faad7cb7ba92f1f33ba36d4731cc82 as epqo0uud3cje68ljmkwpp1ksb: open /var/lib/docker/overlay2/602e8d236dc20cc7a5b8c6e8eee69c1a9757f8541ba9a50682d6b4762ce80150/committed: no such file or directory
Buildx version
github.com/docker/buildx 0.11.2 9872040b6626fb7d87ef7296fd5b832e8cc2ad17, github.com/docker/buildx v0.11.2 9872040
Docker info
# Machine 1:
Client:
Version: 24.0.5
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: 0.11.2
Path: /usr/lib/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: 2.20.3
Path: /usr/lib/docker/cli-plugins/docker-compose
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: true
Native Overlay Diff: false
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 091922f03c2762540fd057fba91260237ff86acb.m
runc version:
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.5.5-arch1-1
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.13GiB
Name: ANUBIS
ID: 64789752-7389-4406-aebf-e0ee6f3a0a50
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
# Machine 2:
Client: Docker Engine - Community
Version: 24.0.5
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.20.2
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 24.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
runc version: v1.1.8-0-g82f18fe
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.1.0-11-amd64
Operating System: Debian GNU/Linux 12 (bookworm)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.823GiB
Name: masterhorst.cs.uni-saarland.de
ID: 8d4a754a-c18d-4728-bb92-6b7218e12de2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Builders list
# Machine 1:
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default * docker
default default running v0.11.6+0a15675913b7 linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/amd64/v4, linux/386
# Machine 2:
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default * docker
default default running v0.11.6+0a15675913b7 linux/amd64, linux/amd64/v2, linux/386
Configuration
./parent.dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm
# Also works with:
# FROM python:3.10
# FROM ubuntu:focal
# If you install "tree", instead of texlive, the issue disappears
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y texlive-latex-extra
./child.dockerfile
# syntax=docker/dockerfile:1
FROM parent
RUN echo "Hello, this is the child image!"
./docker-bake.hcl
group "default" {
targets = ["parent", "child"]
}
target "parent" {
context = "."
dockerfile = "parent.dockerfile"
tags = ["parent"]
}
# Removing the "child" target makes the issue disappear
target "child" {
context = "."
dockerfile = "child.dockerfile"
tags = ["child"]
contexts = {
parent = "target:parent"
}
}
Build logs
[+] Building 263.4s (12/12) FINISHED docker:default
=> [parent internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [parent internal] load build definition from parent.dockerfile 0.0s
=> => transferring dockerfile: 314B 0.0s
=> [child] resolve image config for docker.io/docker/dockerfile:1 2.4s
=> [child] docker-image://docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 8.7s
=> => resolve docker.io/docker/dockerfile:1@sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 0.0s
=> => sha256:ac85f380a63b13dfcefa89046420e1781752bab202122f8f50032edf31be0021 8.40kB / 8.40kB 0.0s
=> => sha256:657fcc512c7369f4cb3d94ea329150f8daf626bc838b1a1e81f1834c73ecc77e 482B / 482B 0.0s
=> => sha256:a17ee7fff8f5e97b974f5b48f51647d2cf28d543f2aa6c11aaa0ea431b44bb89 1.27kB / 1.27kB 0.0s
=> => sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232 11.80MB / 11.80MB 4.0s
=> => extracting sha256:9d9c93f4b00be908ab694a4df732570bced3b8a96b7515d70ff93402179ad232 0.2s
=> [parent internal] load metadata for docker.io/library/debian:bookworm 2.4s
=> [child 1/2] FROM docker.io/library/debian:bookworm@sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 27.4s
=> => resolve docker.io/library/debian:bookworm@sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 0.0s
=> => sha256:eaace54a93d7b69c7c52bb8ddf9b3fcba0c106a497bc1fdbb89a6299cf945c63 1.85kB / 1.85kB 0.0s
=> => sha256:8a6e23e1b192b30eff14036a92e9ecdb551a1a10aa8535728b0c13d14d8c9462 529B / 529B 0.0s
=> => sha256:2657a4a0a6d5e8b3515004185275768f115a64a833de40125bb3f6b0b8cc598b 1.46kB / 1.46kB 0.0s
=> => sha256:167b8a53ca4504bc6aa3182e336fa96f4ef76875d158c1933d3e2fa19c57e0c3 49.56MB / 49.56MB 16.2s
=> => extracting sha256:167b8a53ca4504bc6aa3182e336fa96f4ef76875d158c1933d3e2fa19c57e0c3 1.7s
=> [child internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [child internal] load build definition from child.dockerfile 0.1s
=> => transferring dockerfile: 130B 0.0s
=> [parent 2/2] RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y texlive-latex-extra 232.7s
=> [child 1/2] RUN echo "Hello, this is the child image!" 0.3s
=> [parent] exporting to image 12.5s
=> => exporting layers 12.5s
=> => writing image sha256:b400bd392d4f0b10d4b01fc4a978f59073d0d58c2b56a005c1ad77bf23291ce4 0.0s
=> => naming to docker.io/library/parent 0.0s
=> [child] exporting to image 12.4s
=> => exporting layers 12.4s
=> => writing image sha256:78f42581308f5aaa74e133eef3caaa68c74f3aab76dcb820d365fa222213a046 0.0s
=> => naming to docker.io/library/child
Additional info
No response