After installing docker-ce 25.0.0 while building a Docker image, the container won't run because of a ulimit error
Description
Inside of my Dockerfile, which uses ubuntu:20.04, we install docker-ce. It is essential for us since we need to build AWS CDK code in a custom CodeBuild container.
Here's the Dockerfile code (simplified):
FROM ubuntu:20.04
RUN apt update -y; \
apt upgrade -y; \
apt install software-properties-common -y; \
apt update -y; \
apt install wget -y; \
apt install curl -y; \
apt-get install ca-certificates gnupg lsb-release -y; \
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg; \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null; \
apt-get update -y; \
apt install docker-ce -y; \
apt install unzip -y; \
service docker start; \
rm -rf /var/cache/apt;
ENTRYPOINT service docker start && /bin/bash
COPY install.py .
COPY remove.py .
The container was built successfully up until two days ago. On further investigation, I have found out that this is due to the Docker Engine upgrade to the 25.0.0 version.
If we now try docker run -t container, we get the following output (the same is printed when running service docker start while building the container):
service docker start /etc/init.d/docker: 62: ulimit: error setting limit (Invalid argument)
This is due to the line 62 in /etc/init.d/docker file which sets the ulimit hard limit:
ulimit -Hn 524288
Before the most recent 25.0.0 version release, it used to be the following line:
ulimit -n 1048576
When checking the /etc/security/limits.conf file inside of the Ubuntu image, I found out that the system hard limit is 100000.
If I remove the service docker start command from the Dockerfile (both in the RUN and ENTRYPOINT commands), the issue persists.
The only way I could make my image run is by hardcoding the previous version of docker-ce:
apt install docker-ce=5:24.0.7-1~ubuntu.20.04~focal -y
This has fixed the problem but is still a huge obstacle for us since we are now forced to use the older version of docker-ce and cannot get updates.
I hope this case will be helpful to anyone having the same problem as we did. I also hope a fix will be introduced so we could get the most recent updates on our image.
Reproduce
- Build an image using a Dockerfile and install the latest version of docker-ce.
- Try running the container
Expected behavior
We expected the container to work with version 25.0.0 of Docker Engine in the same way it did with 24.0.7.
docker version
Not accessible since the container couldn't run.
The version being installed is 25.0.0
docker info
Not accessible since the container couldn't run.
Additional Info
No response
Thanks for reporting. It looks like this is an issue in the "engine" code, not the CLI itself, so the better location would probably be https://github.com/moby/moby. Unfortunately, GitHub doesn't allow transferring tickets between orgs, so I cannot move it there (but perhaps you could open a new ticket?)
This issue is related to https://github.com/moby/moby/commit/c8930105bc9fc3c1a8a90886c23535cc6c41e130, which is part of this PR;
- https://github.com/moby/moby/pull/45534
That PR changed the default ulimits to follow systemd's defaults (only raising the hard-limit, not the soft-limit).
When running the docker service as a systemd unit, this would be handled by systemd, but the sysvinit scripts are provided for non-standard setups where systemd is not used (likely in your container). I wondered if perhaps ubuntu 20.04 didn't provide the H / S (hard / soft limits) options, in which case this could be a packaging issue (adjust the file depending on distro and distro-version; similar to https://github.com/docker/docker-ce-packaging/pull/968), but it looks like both 20.04 and 22.04 do;
docker run --rm ubuntu:20.04 bash -c 'ulimit --help'
ulimit: ulimit [-SHabcdefiklmnpqrstuvxPT] [limit]
Modify shell resource limits.
Provides control over the resources available to the shell and processes
it creates, on systems that allow such control.
Options:
-S use the `soft' resource limit
-H use the `hard' resource limit
...
Oh, wait; but you're running service docker start as part of your docker build? Does that work? The docker service requires the container to be running with --privileged, so I'd expect it to fail during a docker build (which doesn't run as --privileged) 🤔
It works with the previous docker-ce version, yes. We don't get any errors there.
I have a similar issue.
I run self-hosted Bitwarden at work on Rocky 8. Until now it has worked for years. After the update to docker-ce 25.0.0 it stopped working. I went back to the previous version, docker-ce 24.0.7, and everything worked again.
Something is wrong with the latest version.
It works with the previous docker-ce version, yes. We don't get any errors there.
Maybe the initial startup worked because no containers were started, so it had just enough privileges to run. I wonder if the previous ulimit call was treated as a no-op if the container was created with the same limits, so the previous setting (ulimit -n 1048576) would be considered "no changes" and therefore succeed.
What version of docker is running on the host (i.e., what version of docker is used to build the image)? Can you provide the output of docker version and docker info?
docker version
Client:
Cloud integration: v1.0.33
Version: 24.0.2
API version: 1.43
Go version: go1.20.4
Git commit: cb74dfc
Built: Thu May 25 21:51:16 2023
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.20.0 (109717)
Engine:
Version: 24.0.2
API version: 1.43 (minimum version 1.12)
Go version: go1.20.4
Git commit: 659604f
Built: Thu May 25 21:50:59 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.6.21
GitCommit: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc:
Version: 1.1.7
GitCommit: v1.1.7-0-g860f061
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Version: 24.0.2
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.5
Path: /Users/myuser/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.18.1
Path: /Users/myuser/.docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /Users/myuser/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.19
Path: /Users/myuser/.docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.4
Path: /Users/myuser/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/myuser/.docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /Users/myuser/.docker/cli-plugins/docker-scan
scout: Command line tool for Docker Scout (Docker Inc.)
Version: v0.12.0
Path: /Users/myuser/.docker/cli-plugins/docker-scout
WARNING: Plugin "/Users/myuser/.docker/cli-plugins/docker-feedback" is not valid: failed to fetch metadata: fork/exec /Users/myuser/.docker/cli-plugins/docker-feedback: no such file or directory
Server:
Containers: 8
Running: 0
Paused: 0
Stopped: 8
Images: 11
Server Version: 24.0.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc version: v1.1.7-0-g860f061
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.15.49-linuxkit-pr
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 5.8GiB
Name: docker-desktop
ID: 3d744663-ef39-4886-a78f-dc0bfd451361
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
Thanks! So, I think this is indeed the issue (https://github.com/docker/cli/issues/4807#issuecomment-1903759890):
I wonder if the previous ulimit call was treated as a no-op if the container was created with the same limits, so the previous setting (ulimit -n 1048576) would be considered "no changes" and therefore succeed.
I tried reproducing the issue; I slightly simplified the Dockerfile, reduced it to only the essential packages, and split the "install docker" step into a separate RUN (to allow caching other steps if I had to change something);
# syntax=docker/dockerfile:1
FROM ubuntu:20.04
RUN apt-get update -y; \
apt-get install -y \
ca-certificates \
curl \
gnupg \
lsb-release \
software-properties-common; \
rm -rf /var/cache/apt;
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg; \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null; \
apt-get update -y; \
apt-get install -y docker-ce; \
rm -rf /var/cache/apt;
RUN \
echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; \
service docker start; \
rm -rf /var/cache/apt;
ENTRYPOINT service docker start && /bin/bash
When building the Dockerfile on a docker 25.0 engine with BuildKit, the build works without errors;
docker build -t foo --no-cache --progress=plain .
#8 22.27 Setting up docker-ce (5:25.0.0-1~ubuntu.20.04~focal) ...
#8 22.30 invoke-rc.d: could not determine current runlevel
#8 22.31 invoke-rc.d: policy-rc.d denied execution of start.
#8 22.44 Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service.
#8 22.57 Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket.
#8 22.57 Setting up xauth (1:1.1-0ubuntu1) ...
#8 22.58 Setting up liberror-perl (0.17029-1) ...
#8 22.58 Setting up git (1:2.25.1-1ubuntu3.11) ...
#8 22.61 Processing triggers for libc-bin (2.31-0ubuntu9.14) ...
#8 22.64 Processing triggers for systemd (245.4-4ubuntu3.22) ...
#8 22.65 Processing triggers for mime-support (3.64ubuntu1) ...
#8 DONE 22.7s
#9 [4/4] RUN echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; service docker start; rm -rf /var/cache/apt;
#9 0.320 ulimits: 524288:1024
#9 0.334 * Starting Docker: docker
#9 0.336 ...done.
#9 DONE 0.3s
#10 exporting to image
#10 exporting layers
#10 exporting layers 3.0s done
#10 writing image sha256:92263a9e8ab60769e2127898c6c2fa5489aa66a383df8523862dcae96962043e
#10 writing image sha256:92263a9e8ab60769e2127898c6c2fa5489aa66a383df8523862dcae96962043e done
#10 naming to docker.io/library/foo done
#10 DONE 3.1s
When using the legacy builder (BuildKit disabled through DOCKER_BUILDKIT=0), however, it fails.
The classic builder starts containers through containerd, whereas BuildKit starts containers itself (in which case the ulimit (LIMIT_NOFILE) settings of the dockerd service are applied, not those from containerd);
DOCKER_BUILDKIT=0 docker build -t foo --no-cache .
...
Setting up docker-ce (5:25.0.0-1~ubuntu.20.04~focal) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service.
Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket.
Setting up xauth (1:1.1-0ubuntu1) ...
Setting up liberror-perl (0.17029-1) ...
Setting up git (1:2.25.1-1ubuntu3.11) ...
Processing triggers for libc-bin (2.31-0ubuntu9.14) ...
Processing triggers for systemd (245.4-4ubuntu3.22) ...
Processing triggers for mime-support (3.64ubuntu1) ...
---> Removed intermediate container 968fb0a992f2
---> c3500524a198
Step 4/5 : RUN echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; service docker start; rm -rf /var/cache/apt;
---> Running in b8bb305e36bc
ulimits: 1073741816:1073741816
/etc/init.d/docker: 62: ulimit: error setting limit (Invalid argument)
---> Removed intermediate container b8bb305e36bc
---> 59eb60be41b5
Step 5/5 : ENTRYPOINT service docker start && /bin/bash
---> Running in b0455bfa8db2
---> Removed intermediate container b0455bfa8db2
---> 2bcfe0c9b509
Successfully built 2bcfe0c9b509
Successfully tagged foo:latest
When running the build on an older version of docker with the previous LIMIT_NOFILE, it fails with BuildKit as well. Here's the same build on a docker 24.0.2 (Ubuntu 18.04) test-machine that I didn't update yet;
docker build -t foo --no-cache --progress=plain .
#8 41.47 Setting up docker-ce (5:25.0.0-1~ubuntu.20.04~focal) ...
#8 41.55 invoke-rc.d: could not determine current runlevel
#8 41.56 invoke-rc.d: policy-rc.d denied execution of start.
#8 41.84 Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /lib/systemd/system/docker.service.
#8 42.08 Created symlink /etc/systemd/system/sockets.target.wants/docker.socket → /lib/systemd/system/docker.socket.
#8 42.09 Setting up xauth (1:1.1-0ubuntu1) ...
#8 42.10 Setting up liberror-perl (0.17029-1) ...
#8 42.12 Setting up git (1:2.25.1-1ubuntu3.11) ...
#8 42.17 Processing triggers for libc-bin (2.31-0ubuntu9.14) ...
#8 42.26 Processing triggers for systemd (245.4-4ubuntu3.22) ...
#8 42.27 Processing triggers for mime-support (3.64ubuntu1) ...
#8 DONE 42.5s
#9 [4/4] RUN echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; service docker start; rm -rf /var/cache/apt;
#9 0.822 ulimits: 1048576:1048576
#9 0.857 /etc/init.d/docker: 62: ulimit: error setting limit (Invalid argument)
#9 DONE 1.0s
#10 exporting to image
#10 exporting layers
#10 exporting layers 14.3s done
#10 writing image sha256:f785f12a2297b42af040deca8ca043e025
I think your build effectively happened to be "lucky" and managed to start because it didn't have to adjust the ulimits, and therefore had just enough privileges to start the service: the ulimit of the container used during the build already had the correct values set, making the ulimit -n 1048576 a no-op, so no privileges were required.
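The no-op case can be checked from any unprivileged shell. Below is a minimal sketch, assuming a POSIX-like shell whose ulimit builtin supports the -H option (as shown above for Ubuntu). The function name reapply_hard_nofile is made up for illustration.

```shell
# Re-apply the current hard limit for open files inside a subshell.
# The value does not change, so no extra privileges should be needed.
reapply_hard_nofile() (
    current="$(ulimit -Hn)"
    ulimit -Hn "$current"
)

if reapply_hard_nofile; then
    echo "re-applying the current hard limit succeeded"
else
    echo "re-applying the current hard limit failed"
fi
```

The function body runs in a subshell, so the calling shell's limits are never modified.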
Also trying what works and what doesn't (also see setrlimit(2) and ulimit):
An unprivileged process may only set its soft limit to a value in the range from 0 up to the hard limit, and (irreversibly) lower its hard limit. A privileged process can make arbitrary changes to either limit value.
- Raising the hard limit (524288 -> 1048576) fails
- Raising the soft limit (1024 -> 2048) works
docker run --rm --ulimit nofile=1024:524288 ubuntu:20.04 sh -c ' echo "before: $(ulimit -Hn):$(ulimit -Sn)"; ulimit -Sn 2048; ulimit -Hn 1048576; echo "after: $(ulimit -Hn):$(ulimit -Sn)"'
sh: 1: ulimit: error setting limit (Operation not permitted)
before: 524288:1024
after: 524288:2048
- Lowering hard limit (1048576 -> 524288) works
- Lowering soft limit (2048 -> 1024) works
docker run --rm --ulimit nofile=2048:1048576 ubuntu:20.04 sh -c ' echo "before: $(ulimit -Hn):$(ulimit -Sn)"; ulimit -Sn 1024; ulimit -Hn 524288; echo "after: $(ulimit -Hn):$(ulimit -Sn)"'
before: 1048576:2048
after: 524288:1024
Keeping both the same also works;
docker run --rm --ulimit nofile=1024:524288 ubuntu:20.04 sh -c ' echo "before: $(ulimit -Hn):$(ulimit -Sn)"; ulimit -Sn 1024; ulimit -Hn 524288; echo "after: $(ulimit -Hn):$(ulimit -Sn)"'
before: 524288:1024
after: 524288:1024
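For an unprivileged process, the rule quoted above comes down to one comparison: a new hard limit is accepted only if it does not exceed the current one. Below is a minimal sketch of such a check, assuming a POSIX shell. The helper name can_set_hard_nofile is hypothetical, and the check deliberately ignores privileged processes (on Linux, those with CAP_SYS_RESOURCE), which may raise the limit.

```shell
# Exit 0 when an unprivileged process could set its hard "open files" limit
# to $1, i.e. when $1 does not exceed the current hard limit.
# The current limit defaults to this shell's own `ulimit -Hn`; pass a second
# argument to check against another value.
can_set_hard_nofile() {
    requested="$1"
    current="${2:-$(ulimit -Hn)}"
    # An unlimited current limit allows any requested value.
    [ "$current" = "unlimited" ] && return 0
    # An unlimited request cannot fit under a finite current limit.
    [ "$requested" = "unlimited" ] && return 1
    [ "$requested" -le "$current" ]
}

# The three cases from the experiments above:
can_set_hard_nofile 1048576 524288 || echo "raise 524288 -> 1048576: not permitted"
can_set_hard_nofile 524288 1048576 && echo "lower 1048576 -> 524288: ok"
can_set_hard_nofile 524288 524288 && echo "keep 524288: ok"
```

A guard like this could wrap the ulimit -Hn call in an init script so that it only runs when it cannot fail. Whether that is preferable to removing the call is a judgment call.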
Note that these ulimits are configured for the docker and containerd services, so their values may differ between hosts, and even between systemd versions, which means that your build could fail depending on that.
Given that you're trying to run the docker engine in a container and ulimits can be set for the container at runtime (using the --ulimit flag, see above), which will be inherited by processes inside the container, I think a good solution for your use-case would be to just disable the ulimit in the init script;
sed -i 's/ulimit -Hn/# ulimit -Hn/g' /etc/init.d/docker;
Here's my Dockerfile with that applied;
# syntax=docker/dockerfile:1
FROM ubuntu:20.04
RUN apt-get update -y; \
apt-get install -y \
ca-certificates \
curl \
gnupg \
lsb-release \
software-properties-common; \
rm -rf /var/cache/apt;
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg; \
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null; \
apt-get update -y; \
apt-get install -y docker-ce; \
rm -rf /var/cache/apt;
RUN \
echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; \
sed -i 's/ulimit -Hn/# ulimit -Hn/g' /etc/init.d/docker; \
service docker start; \
rm -rf /var/cache/apt;
ENTRYPOINT service docker start && /bin/bash
With that change, the build succeeds;
#9 [4/4] RUN echo "ulimits: $(ulimit -Sn):$(ulimit -Hn)"; sed -i 's/ulimit -Hn/# ulimit -Hn/g' /etc/init.d/docker; service docker start; rm -rf /var/cache/apt;
#9 0.864 ulimits: 1048576:1048576
#9 0.906 * Starting Docker: docker
#9 0.910 ...done.
#9 DONE 1.0s
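One caveat with the sed substitution used above: if it runs a second time on the same file, it prepends a second "#" to the line. A variant that only touches lines not yet commented out might look like the sketch below. The function name disable_hard_nofile and the throwaway demo file are made up for illustration, and the sketch assumes GNU sed's -i option (GNU sed is what Ubuntu ships).

```shell
# Comment out "ulimit -Hn ..." lines in the given init script, keeping the
# original indentation. Lines that are already commented out do not match,
# so running the function more than once leaves the file unchanged.
disable_hard_nofile() {
    sed -i 's/^\([[:space:]]*\)ulimit -Hn/\1# ulimit -Hn/' "$1"
}

# Demo on a temporary file rather than the real /etc/init.d/docker.
demo="$(mktemp)"
printf '%s\n' 'start() {' '    ulimit -Hn 524288' '}' > "$demo"
disable_hard_nofile "$demo"
disable_hard_nofile "$demo"   # second run changes nothing
cat "$demo"
rm -f "$demo"
```

In a Dockerfile the RUN step executes once per build, so the simpler substitution is sufficient there. The anchored pattern mainly helps when the same script is patched repeatedly, for example from an entrypoint.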
I got the same error following the installation instructions on https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
@eloaf what is the issue you're having? Is that running the docker engine as part of a Dockerfile? For that see my comment above.
Yes - starting the docker service yields the error. Checking the system's ulimit -Hn and then modifying the ulimit set in /etc/init.d/docker fixes it, similar to your fix.
It's weird; I remember having the same issue years ago, then nothing, and then I get it again today after installing 25.0.0 on Ubuntu.
Yes, it's possible these limits changed over time, and (per my comment above) "changing" the limit only worked if no actual changes were applied (so it being a no-op).
Curious; is there a reason to start the docker service as part of the build? Is it only to verify install was successful, or does it serve any other purpose?
I mean, I need to start the docker service at some point to perform any builds? (I was working on a node that gets spun up and down and didn't persist the installation.)
But this is a docker service inside a container, and the container is part of the Dockerfile / run during docker build. Running the docker service requires a privileged (--privileged) container. Containers that are used during docker build do not support --privileged (by design).
It's still possible to run the resulting image as privileged, but in that case the container must be started with the --privileged flag.