
Docker leaks disk space?

Open bra-fsn opened this issue 3 years ago • 15 comments

  • [x] This is a bug report
  • [ ] This is a feature request
  • [x] I searched existing issues before opening this one

Expected behavior

Docker seems to use too much disk space in /var/lib/docker. I did a test:

  • stopped and removed all containers
  • removed all volumes
  • docker pruned everything (docker image prune -fa, docker system prune -fa, docker buildx prune -fa)

After this, I would expect the space usage to drop to nearly 0.
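The cleanup steps above can be combined into one script (a sketch only; it assumes the default `/var/lib/docker` root and root access for the `du` check):

```shell
#!/bin/sh
# Sketch of the cleanup test described above: remove all containers and
# volumes, prune everything, then compare Docker's own accounting with
# what is actually on disk.
docker ps -aq | xargs -r docker rm -f            # stop and remove all containers
docker volume ls -q | xargs -r docker volume rm  # remove all volumes
docker image prune -fa
docker system prune -fa
docker buildx prune -fa

docker system df              # what Docker thinks it uses
sudo du -hs /var/lib/docker/  # what the filesystem actually reports
```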

Actual behavior

$ docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
$ docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
$ docker volume ls 
DRIVER    VOLUME NAME
$ docker buildx du
Reclaimable:	0B
Total:		0B
$ docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
$ docker image ls -a
REPOSITORY   TAG       IMAGE ID   CREATED   SIZE
$ sudo du -hs /var/lib/docker/
192G	/var/lib/docker/
$ sudo systemctl restart docker
$ sudo du -hs /var/lib/docker/
192G	/var/lib/docker/

Steps to reproduce the behavior

Use Docker for a while.

Output of docker version:

Client: Docker Engine - Community
 Version:           20.10.17
 API version:       1.41
 Go version:        go1.17.11
 Git commit:        100c701
 Built:             Mon Jun  6 23:02:57 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.17
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.11
  Git commit:       a89b842
  Built:            Mon Jun  6 23:01:03 2022
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc:
  Version:          1.1.2
  GitCommit:        v1.1.2-0-ga916309
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Output of docker info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.2-0-ga916309
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-122-generic
 Operating System: Ubuntu 20.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 61.37GiB
 Name: ip-10-150-4-168
 ID: ZJFE:DJF7:DUGP:WOGK:A66E:IVQS:X6HB:CKP4:SBRG:VROU:ZIBO:GWXF
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.) AWS EC2 node

bra-fsn avatar Aug 01 '22 11:08 bra-fsn

Do you know what directory inside /var/lib/docker is consuming that space?
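One quick way to answer that (a sketch; assumes root access and the default Docker root) is a depth-limited `du` sorted by size:

```shell
# Show per-directory usage one level below the Docker root, largest last.
sudo du -h --max-depth=1 /var/lib/docker | sort -h
```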

thaJeztah avatar Aug 18 '22 10:08 thaJeztah

Do you know what directory inside /var/lib/docker is consuming that space?

I should've included that, sorry @thaJeztah. All containers stopped, before cleanup:

# df -h /
/dev/nvme0n1p1  1.5T  888G  529G  63% /
# docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          265       16        279.2GB   276.3GB (98%)
Containers      29        0         8.684MB   8.684MB (100%)
Local Volumes   20        16        119.5GB   4.052GB (3%)
Build Cache     3293      0         2.194kB   2.194kB
# ncdu /var/lib/docker
--- /var/lib/docker ------------------------------------------------------------
  511.2 GiB [##########] /overlay2                                              
  110.6 GiB [##        ] /volumes
   40.0 GiB [          ] /containers
    3.2 GiB [          ] /image
  848.7 MiB [          ] /buildkit
  296.0 KiB [          ] /network
   16.0 KiB [          ] /plugins
e   4.0 KiB [          ] /trust
e   4.0 KiB [          ] /tmp
e   4.0 KiB [          ] /swarm
e   4.0 KiB [          ] /runtimes

Pruning:

# docker container rm $(docker container ls -aq)
# docker buildx prune -af
Total:	262.6GB
# docker volume prune  -f
Total reclaimed space: 115.5GB
# docker image prune -af
Total reclaimed space: 279GB
# docker system prune -af --volumes
# docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B
# ncdu /var/lib/docker
--- /var/lib/docker ------------------------------------------------------------
  190.1 GiB [##########] /overlay2                                              
    1.8 GiB [          ] /image  
  856.0 MiB [          ] /buildkit  
  296.0 KiB [          ] /network
   36.0 KiB [          ] /volumes 
   16.0 KiB [          ] /plugins
e  12.0 KiB [          ] /containers
e   4.0 KiB [          ] /trust
e   4.0 KiB [          ] /tmp
e   4.0 KiB [          ] /swarm
e   4.0 KiB [          ] /runtimes
# df -h /
/dev/nvme0n1p1  1.5T  415G 1002G  30% /

bra-fsn avatar Aug 25 '22 23:08 bra-fsn

I have the very same behaviour here.

# docker info:
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.8.2-docker)
  scan: Docker Scan (Docker Inc., v0.17.0)

Server:
 Containers: 6
  Running: 6
  Paused: 0
  Stopped: 0
 Images: 6
 Server Version: 20.10.17
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
 runc version: v1.1.4-0-g5fd4c4d
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.10.0-17-amd64
 Operating System: Debian GNU/Linux 11 (bullseye)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31.22GiB
 Name: deb-server
 ID: ND6D:GYBM:OY2F:H56N:S7HX:C7KL:TAKF:J6QZ:ICJS:I3AC:UXLC:ZVUB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

# docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          6         6         4.502GB   129.7MB (2%)
Containers      6         6         207.5GB   0B (0%)
Local Volumes   1         1         27.61MB   0B (0%)
Build Cache     0         0         0B        0B

# docker images
REPOSITORY                               TAG       IMAGE ID       CREATED        SIZE
lscr.io/linuxserver/paperless-ngx        latest    367f608b8b16   2 weeks ago    1.23GB
linuxserver/jellyfin                     latest    aa09b1b724a4   2 weeks ago    795MB
ownyourbits/nextcloudpi-x86              latest    417ac417454a   2 weeks ago    1.39GB
cr.portainer.io/portainer/portainer-ce   latest    ab836adaa325   5 weeks ago    278MB
jc21/nginx-proxy-manager                 latest    7c775dbb91f2   5 months ago   921MB
containrrr/watchtower                    latest    333de6ea525a   7 months ago   16.9MB

uuuu1234 avatar Sep 04 '22 06:09 uuuu1234

I'm observing exactly the same behavior.

# docker version
Client: Docker Engine - Community
 Version:           20.10.23
 API version:       1.41
 Go version:        go1.18.10
 Git commit:        7155243
 Built:             Thu Jan 19 17:36:25 2023
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.23
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.18.10
  Git commit:       6051f14
  Built:            Thu Jan 19 17:34:14 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.15
  GitCommit:        5b842e528e99d4d4c1686467debf2bd4b88ecd86
 runc:
  Version:          1.1.4
  GitCommit:        v1.1.4-0-g5fd4c4d
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Running docker prune or docker buildx prune doesn't clean all the data - some layers still consume disk space under the /var/lib/docker/overlay2/ directory. Only stopping the Docker service and completely removing the /var/lib/docker/ directory helps, and only for some time.
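The full reset described above can be sketched as follows (destructive: this deletes all images, containers, volumes, and build cache; it assumes systemd and the default Docker root):

```shell
# DESTRUCTIVE: wipes all Docker state. Stop the daemon (and its socket)
# first so nothing in /var/lib/docker is in use, then remove the root
# directory and restart; Docker recreates it empty on startup.
sudo systemctl stop docker docker.socket
sudo rm -rf /var/lib/docker
sudo systemctl start docker
```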

gilgameshfreedom avatar Feb 23 '23 20:02 gilgameshfreedom

Same here after docker system prune -a -f --volumes:

$ docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1      1007G  514G  443G  54% /var/lib/docker

derselbst avatar Apr 18 '23 15:04 derselbst

I have the same issue, on multiple systems.

philhabs avatar Jun 16 '23 19:06 philhabs

Same here on two systems: Docker version 24.0.2, build cb74dfc

raisti78 avatar Jul 04 '23 09:07 raisti78

I don't have a clean, self-contained repro, but if it helps: it seems to have started after I modified my Dockerfile to split one container into two, reusing the same image for both with a different run command. Everything is built locally.

philhabs avatar Jul 04 '23 12:07 philhabs

We observed the same behaviour on a CI server. Docker version 24.0.5.

jgoelen avatar Sep 04 '23 06:09 jgoelen

  • possibly related: https://github.com/moby/moby/issues/46136

thaJeztah avatar Sep 04 '23 07:09 thaJeztah

I have the exact same behavior, and our CI systems are running out of disk space.

root@cibuildarm6401:~# docker system prune --all --volumes --force
Total reclaimed space: 0B


root@cibuildarm6401:~# docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          0         0         0B        0B
Containers      0         0         0B        0B
Local Volumes   0         0         0B        0B
Build Cache     0         0         0B        0B


root@cibuildarm6401:~# du /var/lib/docker/overlay2 -sh
59G     /var/lib/docker/overlay2


root@cibuildarm6401:~# docker --version
Docker version 24.0.7, build afdd53b


root@cibuildarm6401:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

root@cibuildarm6401:~# uname -a
Linux cibuildarm6401 5.4.0-165-generic #182-Ubuntu SMP Mon Oct 2 19:45:52 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux


nook24 avatar Oct 27 '23 09:10 nook24

Same here.

# docker --version
Docker version 20.10.12, build e91ed57
# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.3 LTS
Release:	20.04
Codename:	focal
# uname -a
Linux ip-172-31-49-194 5.15.0-1040-aws #45~20.04.1-Ubuntu SMP Tue Jul 11 19:08:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

daithi-walker avatar Oct 31 '23 18:10 daithi-walker

Ran into this again. As a workaround, I nuked Docker entirely and reinstalled it.

apt-get remove docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
rm -rf /var/lib/docker
apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

nook24 avatar Feb 22 '24 09:02 nook24

I see the same issue on my local machine when deploying with Kamal, which uses Docker. After several deploys, it breaks with the error no space left on device - only 75MB left. Here is the open issue: https://github.com/basecamp/kamal/issues/765

abratashov avatar Apr 09 '24 15:04 abratashov

@abratashov From your linked ticket (https://github.com/basecamp/kamal/issues/765), I suspect you're running a containerised BuildKit builder (using the docker-container driver);

I've found a workaround in this comment: docker/for-linux#1423 (comment). Only removing the folders freed up space:

sudo rm -R /var/lib/docker/volumes/buildx_buildkit_kamal-my-project-demo-multiarch0_state
...

In that case, the build is executed by a standalone BuildKit daemon that runs inside a container. Such instances use a volume to persist state (including build cache). When cleaning up data (using docker system prune), Docker won't clean up such volumes. If the volume is still in use by a container, it can't (because the container is using it); and it won't delete named volumes, because named volumes are designed to persist data beyond the container's lifecycle. Such data may be (e.g.) data used by a database running in a container, so deleting the volume automatically would be a destructive operation. It's possible to configure BuildKit to perform garbage collection of old or unused build cache, but this may require additional configuration options when creating the builder.
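A sketch of what that extra configuration could look like, with hedging: the buildkitd.toml keys, the `--config` flag, and the builder name `gc-builder` are assumptions based on my understanding of buildx, so verify them against your buildx version's documentation:

```shell
# Hedged sketch: create a docker-container builder with a GC policy so
# BuildKit trims its own cache instead of growing without bound.
cat > /tmp/buildkitd.toml <<'EOF'
[worker.oci]
  enabled = true
  gc = true
  gckeepstorage = "10GB"
EOF

docker buildx create --name gc-builder \
  --driver docker-container \
  --config /tmp/buildkitd.toml

# Manual cleanup of that builder's cache when needed:
docker buildx prune --builder gc-builder -af
```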

thaJeztah avatar Apr 10 '24 08:04 thaJeztah