Fedora Docker-CE-Engine 20.10.13 consumes all available system memory (kernel 5.16.13)
Description: The issue occurred on Fedora release 35.
Kernel information: Linux 5.16.12-200.fc35.x86_64 #1 SMP PREEMPT Wed Mar 2 19:06:17 UTC 2022
When starting a docker-compose project that includes mysql with Docker-CE-Engine version 20.10.13, it consumes all available system memory. With version 20.10.10 the issue does not exist and the docker-compose project requires only ~2GB of RAM.
Steps to reproduce the issue:
- set up a docker-compose project including mysql:5.6 (a minimal compose file is sketched after this list)
- run the project with docker-compose
- monitor memory usage, for example with a system/activity monitor
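For reference, a compose file along these lines is enough to reproduce the setup described above (the service name, file layout, and environment variable used here are illustrative, not taken from the original project):

# docker-compose.yml (illustrative sketch)
version: "3"
services:
  db:
    image: mysql:5.6
    environment:
      MYSQL_ALLOW_EMPTY_PASSWORD: "1"

# then start it and monitor memory usage
docker-compose up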
Describe the results you received: All available system memory is consumed and the system stops working at some point.
Describe the results you expected: I would expect around the same memory consumption as with the previously working version 20.10.10.
Additional information you deem important (e.g. issue happens only occasionally): The issue was consistently reproducible, and only a downgrade to 20.10.10 resolved it.
Version info where the issue occurred:

Client: Docker Engine - Community
 Version: 20.10.13
 API version: 1.41
 Go version: go1.16.15
 Git commit: a224086
 Built: Thu Mar 10 14:08:18 2022
 OS/Arch: linux/amd64
 Context: default
 Experimental: true

Server: Docker Engine - Community
 Engine:
  Version: 20.10.13
  API version: 1.41 (minimum version 1.12)
  Go version: go1.16.15
  Git commit: 906f57f
  Built: Thu Mar 10 14:06:06 2022
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.5.10
  GitCommit: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
 runc:
  Version: 1.0.3
  GitCommit: v1.0.3-0-gf46b6ba
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Additional environment details (AWS, VirtualBox, physical, etc.): physical hardware.
I got the same issue with the mysql:5.7 image and version 20.10.13; memory consumption is very high, so the system is swapping and the startup sequence is extremely slow. It can easily be reproduced with a docker run -i mysql:5.7.
The problem doesn't seem to exist with mysql:8.0 and, as a matter of fact, everything worked with the previous Docker version.
I downgraded to 20.10.12, 20.10.11 and 20.10.10 (the three latest versions available in the official repo) and I still hit the same issue. Maybe it's a kernel issue.
Thanks for reporting; so to reproduce the issue, just a docker run -i mysql:5.7 (no other options) is sufficient?
If that's the case, that's odd indeed. As a workaround to prevent the system from running out of memory, you could of course add memory constraints to the container itself (but that wouldn't fix the underlying issue, just possibly prevent it from consuming all memory).
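For illustration, a memory limit can be set either on the docker run command line or in the compose file; the 2g value below is just an example, not a recommendation:

# cap the container's memory on the command line
docker run --memory=2g -i mysql:5.7

# or, in a (v2-format) compose file
services:
  db:
    image: mysql:5.7
    mem_limit: 2g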
Could you perhaps also add the output of docker info? (That contains additional information, such as kernel version, storage driver, etc.) Of course, feel free to redact information where needed.
Thanks for reporting; so to reproduce the issue, just a docker run -i mysql:5.7 (no other options) is sufficient?
Yes, it's as simple as that. The output sits at the following line for a while (while eating all the RAM), and eventually the init continues.
2022-03-14 12:18:46+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
As a workaround to prevent the system from running out of memory, you could of course add memory constraints to the container itself
Actually, setting a memory constraint makes mysql init fail:
docker run --memory 1073741824 -i mysql:5.7
2022-03-14 12:25:12+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.37-1debian10 started.
2022-03-14 12:25:16+00:00 [ERROR] [Entrypoint]: mysqld failed while attempting to check config
command was: mysqld --verbose --help --log-bin-index=/tmp/tmp.sz9LdwWe78
(Same command works with mysql:8.0)
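A data point that might be worth collecting here (this is a suggestion, and an assumption that it is related): MySQL 5.6/5.7 sizes several per-file-descriptor structures based on the process's open-files limit, so an effectively unlimited nofile limit inside the container can translate into very large allocations. The limit the container actually receives can be checked with:

docker run --rm mysql:5.7 bash -c 'ulimit -n'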
docker info, here it is:
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 12
Running: 0
Paused: 0
Stopped: 12
Images: 80
Server Version: 20.10.13
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 2a1d4dbdb2a1030dc5b01e96fb110a9d9f150ecc
runc version: v1.0.3-0-gf46b6ba
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.14.10-300.fc35.x86_64
Operating System: Fedora Linux 35 (Workstation Edition)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.39GiB
Name: localhost.localdomain
ID: BVZQ:2MR3:XMZ6:OCVR:RHF2:SLKM:UIVC:KELR:PYSI:PW7R:2GX5:D3FB
Docker Root Dir: /home/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
I switched back to an older kernel version (5.14.10 here; I originally identified the issue with 5.16.13).
FTR, @LeSuisse identified that a rollback to containerd.io-1.4.13-3.1.fc35 solves the problem
Using different kernels (5.14.18-300.fc35, 5.16.14-200.fc35) and Docker CE Engine versions (20.10.12, 20.10.11, 20.10.10) did not resolve the issue for me. Only downgrading to containerd.io-1.4.13-3.1.fc35 resolved the memory leak.
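For anyone who wants to try the same thing, the downgrade itself looks like this (package version taken from the comment above; restarting the services just ensures the downgraded binaries are picked up):

sudo dnf downgrade containerd.io-1.4.13-3.1.fc35
sudo systemctl restart containerd docker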
Here's my docker info output of the stable setup:
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Docker Buildx (Docker Inc., v0.8.0-docker)
scan: Docker Scan (Docker Inc., v0.17.0)
Server:
Containers: 12
Running: 12
Paused: 0
Stopped: 0
Images: 146
Server Version: 20.10.13
Storage Driver: btrfs
Build Version: Btrfs v5.16.2
Library Version: 102
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9cc61520f4cd876b86e77edfeb88fbcd536d1f9d
runc version: v1.0.3-0-gf46b6ba
init version: de40ad0
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.16.13-200.fc35.x86_64
Operating System: Fedora Linux 35 (Workstation Edition)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 23.23GiB
Name: localhost.localdomain
ID: [REDACTED]
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Are you seeing the same happening if you run the container through containerd?
Something like:
ctr image pull docker.io/library/mysql:5.7
ctr run --env MYSQL_ALLOW_EMPTY_PASSWORD=1 -t docker.io/library/mysql:5.7 mycontainer
Running the container through containerd (both 1.4.13-3.1.fc35 and 1.5.10-3.1.fc35) does not trigger the memory leak.
However, in order to run the container I had to do some mounting trickery (hopefully this doesn't turn it into an apples-to-oranges comparison):
cd /home/kait
mkdir run
chmod 777 run
ctr run --rm --mount "type=bind,src=/home/kait/run,dst=/var/run/mysqld,options=rbind:rw" --env MYSQL_ALLOW_EMPTY_PASSWORD=1 docker.io/library/mysql:5.7 mycontainer
Otherwise the container initialization would fail with error:
2022-03-18T07:28:20.058885Z 0 [ERROR] Could not create unix socket lock file /var/run/mysqld/mysqld.sock.lock.
2022-03-18T07:28:20.058893Z 0 [ERROR] Unable to setup unix socket lock file.
2022-03-18T07:28:20.058897Z 0 [ERROR] Aborting
And the server would shut down.
Is there anything we can do here to move this forward? Should we report the issue to Fedora as well?
Is there any update about this?
Small update for the people using Fedora: after the upgrade to Fedora 36 there is no straightforward way to downgrade containerd. I just learned this the hard way after upgrading, with the bug still present 😅.
Maybe interesting for those who already upgraded to Fedora 36: I found a way to still downgrade to the working versions by specifying the Fedora release version in dnf:
#!/bin/bash
sudo dnf --releasever=35 downgrade docker-ce-3:20.10.10 docker-ce-cli-3:20.10.10 containerd.io-1.4.13
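To verify that the downgrade actually took effect (a sanity check, not part of the original instructions):

rpm -q docker-ce docker-ce-cli containerd.io
docker version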
Hello,
I've upgraded my workstation to Fedora 36 with the latest versions of containerd.io and docker-ce, and the issue is still here. Only the downgrade suggested by @kevin0x90 seems to provide a running MySQL container without it consuming all the memory.
How can we help you to solve this issue?
This is how it works for me on Fedora 36:
Downgrade containerd.io as @kevin0x90 wrote:
sudo dnf --releasever=35 downgrade docker-ce-3:20.10.10 docker-ce-cli-3:20.10.10 containerd.io-1.4.13
Then freeze the containerd.io version to prevent further upgrading:
sudo dnf install 'dnf-command(versionlock)'
sudo dnf versionlock containerd.io-1.4.13
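The lock can later be inspected or removed with the standard versionlock subcommands (added for completeness):

sudo dnf versionlock list
sudo dnf versionlock delete 'containerd.io*'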
A good-to-know addition regarding the versionlock: if you use GNOME Software for updates, it will ignore the versionlock in dnf (https://bugzilla.redhat.com/show_bug.cgi?id=1671489). I just stumbled across this recently.
For the record, switching from docker-ce to moby-engine and the related packages provided by Fedora solved the issue for me.
@vaceletm could you point me in the right direction for switching from docker-ce to moby?
Here is the full script of what I had to do; some of the changes might be related to the Compose v2 switch (BuildKit by default, but I didn't track down everything):
$> dnf install moby-engine --allowerasing
$> sudo systemctl edit docker
# add the following to the drop-in file that opens:
[Service]
LimitNOFILE=1024
$> sudo systemctl daemon-reload
$> sudo setenforce 0   # 0 = permissive; "disabled" is not a valid setenforce argument
$> vim /etc/selinux/config
SELINUX=permissive
$> sudo systemctl restart docker
Be careful: with this approach you effectively disable SELinux enforcement on your platform, so you might be at risk. Evaluate the consequences beforehand.
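To confirm which mode SELinux is actually running in after these changes (standard SELinux tooling, shown for illustration):

getenforce   # prints Enforcing, Permissive, or Disabled
sestatus     # detailed status, including the mode set in /etc/selinux/config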
Just came across this issue yesterday. Figure I'd provide additional info and which solution worked best for me.
Kernel: 5.18.18-200.fc36.x86_64
Docker version:
Client: Docker Engine - Community
Version: 20.10.17
API version: 1.41
Go version: go1.17.11
Git commit: 100c701
Built: Mon Jun 6 23:03:59 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.17
API version: 1.41 (minimum version 1.12)
Go version: go1.17.11
Git commit: a89b842
Built: Mon Jun 6 23:01:39 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.7
GitCommit: 0197261a30bf81f1ee8e6a4dd2dea0ef95d67ccb
runc:
Version: 1.1.3
GitCommit: v1.1.3-0-g6724737
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker run --rm apache/airflow:2.3.3-python3.9 scheduler
worked fine.
docker run --rm apache/airflow:2.3.4-python3.9 scheduler
ate up memory.
I tried uninstalling docker-ce's docker-engine and installing Fedora's moby-engine, which worked, but ran into SELinux issues as mentioned above.
What works decently well for me is Docker Desktop for Linux. I just enable "Start Docker Desktop when you log in" (and change other settings...), and then change the CLI's context via:
docker context ls
docker context use desktop-linux
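For one-off commands you can also select the context per invocation instead of switching it globally (an alternative, assuming the same desktop-linux context name):

DOCKER_CONTEXT=desktop-linux docker ps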
What's nice is that you can run docker commands without sudo.
Other than running into UID-related issues, things seem to be working fine.
Client: Docker Engine - Community
Cloud integration: v1.0.28
Version: 20.10.17
API version: 1.41
Go version: go1.17.11
Git commit: 100c701
Built: Mon Jun 6 23:03:59 2022
OS/Arch: linux/amd64
Context: desktop-linux
Experimental: true
Server: Docker Desktop 4.11.1 (84025)
Engine:
Version: 20.10.17
API version: 1.41 (minimum version 1.12)
Go version: go1.17.11
Git commit: a89b842
Built: Mon Jun 6 23:01:23 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.6
GitCommit: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc:
Version: 1.1.2
GitCommit: v1.1.2-0-ga916309
docker-init:
Version: 0.19.0
GitCommit: de40ad0
See: https://github.com/containerd/containerd/pull/7566#issuecomment-1285417325
Hi. I have Fedora 36 and solved this issue by changing LimitNOFILE=infinity to LimitNOFILE=1048576 in /usr/lib/systemd/system/containerd.service. After a reboot everything works.
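A variant of the same fix that survives containerd.io package upgrades (edits to the shipped unit file are overwritten on update) is a systemd drop-in, using the same value as above:

sudo systemctl edit containerd
# contents of the drop-in:
[Service]
LimitNOFILE=1048576
sudo systemctl daemon-reload
sudo systemctl restart containerd docker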
Looks like this is effectively a duplicate of / covered by https://github.com/moby/moby/issues/38814, and will be addressed by https://github.com/moby/moby/pull/45534