Docker Release 28.5.2 breaks Buildx Build on Sysbox Runtime version v0.6.7
@ctalledo
With Docker release 28.5.2, which includes the changes listed below, docker buildx build now fails with the following errors on Sysbox runtime v0.6.7 on Ubuntu 24.04.
Packaging updates:
- Update BuildKit to v0.25.2. moby/moby#51398
- Update Go runtime to 1.24.9. moby/moby#51387, docker/cli#6613
- Update runc to v1.3.3. moby/moby#51394
Docker Engine version 28.5.2 release notes
Error:
container-builder
#1 [internal] booting buildkit
#1 pulling image artifactory.company.com:17114/moby/buildkit:v0.25.2
#1 pulling image artifactory.company.com:17114/moby/buildkit:v0.25.2 2.5s done
#1 creating container buildx_buildkit_container-builder0
#1 creating container buildx_buildkit_container-builder0 0.4s done
#1 ERROR: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open sysctl net.ipv4.ip_unprivileged_port_start file: unsafe procfs detected: openat2 /proc/./sys/net/ipv4/ip_unprivileged_port_start: invalid cross-device link: unknown
------
> [internal] booting buildkit:
------
ERROR: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open sysctl net.ipv4.ip_unprivileged_port_start file: unsafe procfs detected: openat2 /proc/./sys/net/ipv4/ip_unprivileged_port_start: invalid cross-device link: unknown
This Docker release contains fixes for three high-severity security vulnerabilities in runc:
CVE-2025-31133 CVE-2025-52565 CVE-2025-52881
All three vulnerabilities ultimately allow (through different methods) for full container breakouts by bypassing runc's restrictions for writing to arbitrary /proc files.
It's not only buildx. Docker inside docker doesn't work now 😢
Reproducer:
docker run -it debian:trixie-slim bash
# install docker inside docker (see https://docs.docker.com/engine/install/debian/):
apt-get update
apt-get install ca-certificates curl
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
chmod a+r /etc/apt/keyrings/docker.asc
tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: $(. /etc/os-release && echo "$VERSION_CODENAME")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
apt-get update
apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# fix the service file
sed -i 's/ulimit -Hn/# ulimit -Hn/g' /etc/init.d/docker
service docker start
>> docker --version
Docker version 28.5.2, build ecc6942
>> docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
17eec7bbc9d7: Pull complete
Digest: sha256:56433a6be3fda188089fb548eae3d91df3ed0d6589f7c2656121b911198df065
Status: Downloaded newer image for hello-world:latest
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open sysctl net.ipv4.ip_unprivileged_port_start file: unsafe procfs detected: openat2 /proc/./sys/net/ipv4/ip_unprivileged_port_start: invalid cross-device link: unknown
Run 'docker run --help' for more information
https://github.com/opencontainers/runc/issues/4968
https://github.com/containerd/containerd/issues/12484#issuecomment-3494405793
@pdziuba This is due to a design flaw in AppArmor. https://github.com/opencontainers/runc/issues/4968 lists the necessary workarounds.
@abhi4u1947 The error you are getting is different. It would be caused by a bind mount being placed on top of /proc/..., and is thus an expected error. I don't know if sysbox does this, but if it does try to fake procfs files, that won't work with runc anymore, because it is an attack vector we needed to close.
if it does try to fake procfs files
I am not knowledgeable about sysbox internals, but "Virtualizes portions of procfs & sysfs inside the container." (from the readme) would seem to suggest that it does indeed.
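One way to check whether anything is mounted on top of procfs inside the outer container is to look at /proc/self/mountinfo. The following is a minimal sketch, not something posted by the participants: the function name list_proc_overmounts is made up for illustration, and it only lists mount points below /proc together with their filesystem type. It does not tell you which component created them.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not part of sysbox or runc).
# Prints "<mount point> <fstype>" for every mount located below /proc.
# Reads /proc/self/mountinfo by default; a file path can be passed instead.
list_proc_overmounts() {
    awk '{
        # Field 5 is the mount point. The fstype is the first field after
        # the "-" separator, which follows a variable number of optional fields.
        fstype = "?"
        for (i = 7; i <= NF; i++) {
            if ($i == "-") { fstype = $(i + 1); break }
        }
        if ($5 ~ /^\/proc\//) print $5, fstype
    }' "${1:-/proc/self/mountinfo}"
}
```

Calling list_proc_overmounts with no arguments in the container where the inner Docker runs prints one line per mount below /proc. An entry such as /proc/sys would mean that lookups under that path cross a mount boundary, which is the situation described in the comment above.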
Ran into this yesterday. I'm running systemd+Docker in a container. Downgrading containerd.io to 1.7.28-1~ubuntu.24.04~noble in the inner Docker worked for me.
apt-get install -y --allow-downgrades containerd.io=1.7.28-1~ubuntu.24.04~noble \
&& apt-mark hold containerd.io
Thank you for this finding. So updating the Docker daemon and the Docker CLI is still possible, which makes sense: the three CVEs are in runc, which is distributed in the containerd.io package.
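Because the behaviour change ships with runc rather than with the daemon or the CLI, it can help to check which runc the inner Docker actually uses. Below is a rough sketch under stated assumptions: both helper names are hypothetical, the 1.3.3 threshold comes from the release notes quoted at the top of this thread and only makes sense for the 1.3.x line (other release lines may carry the same fixes under different version numbers), and the parser assumes the first line of the runc --version output has the form "runc version X.Y.Z".

```shell
#!/bin/sh
# Hypothetical helpers, named for illustration only.

# Reads `runc --version` output on stdin and prints the numeric version,
# e.g. "1.3.3". Any suffix after the dotted number is dropped.
runc_version_from_output() {
    sed -n 's/^runc version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeeds (exit 0) if dotted version $1 is greater than or equal to $2.
# Missing components are treated as 0, so 1.3 compares equal to 1.3.0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Example use (needs runc on PATH, so it is left commented out):
# v=$(runc --version | runc_version_from_output)
# if version_ge "$v" 1.3.3; then echo "runc $v: stricter procfs checks expected"; fi
```

The comparison is done numerically per component rather than as a string, so a version such as 1.10.0 is correctly treated as newer than 1.3.3.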
I'm experiencing the same issue for builds and running containers.
@cyphar, is the problem now in hands of the AppArmor kernel team, or will all dependent projects need to adapt to the recent changes?
I am running the dind image under the sysbox-runc runtime and cannot get inner containers to run:
- Downgrading the host's containerd does not help.
- Image 28-dind with internal containerd version v1.7.28 does not work.
- Downgrading the dind image to 27-dind with internal containerd version v1.7.25 works, even when the host's containerd is up to date (version v2.1.5).
Simple test case is:
docker run --rm -it --runtime sysbox-runc --name dind-test docker:28-dind
docker exec -it dind-test docker run --rm -it alpine
With 28-dind it does not work, with 27-dind it works.
Update (1): The downgrade issue is strange. Other reports mention that downgrading to containerd v1.7.28 helps, but actually 28-dind has this version. Yet it does not work.
Update (2): Is there anything on the sysbox-runc side, which needs to be done? Or is the issue in the Docker dind image actually?
@cruizba Are you asking about https://github.com/opencontainers/runc/issues/4968 (the permission error due to /proc/sys writes being treated as though they were /sys writes with AppArmor)? I am going to be speaking at a kernel conference in early December and plan to discuss this with some AppArmor folks in person.
However, I suspect solving this problem may be quite difficult, and there are no real workarounds we can apply from runc, so downstreams will probably just have to live with adjusting their AppArmor profiles to no longer block /sys/* writes if you allow nested containers (though as discussed in https://github.com/opencontainers/runc/issues/4968, the protection you got from this was easily bypassed if you allowed nesting anyway).
If you're talking about the error described in this bug (openat2 /proc/./sys/net/ipv4/ip_unprivileged_port_start: invalid cross-device link) then as I said above, this is intended behaviour and sysbox will need to either stop putting fake procfs files in /proc or patch their sysbox-runc to effectively remove the protection against CVE-2025-52881. runc is returning an error here (EXDEV) because it detected a fake procfs file, this is just our new protections working as intended.
EDIT: Though I am a little surprised that runc will even see the overmounts -- we now try to use fsopen(2) or open_tree(2) to get a private handle to procfs free of overmounts, but I guess there is seccomp policy blocking this? (I wonder if this being allowed may also break sysbox in other ways, if their fake /proc is actually needed for their system to work...)
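To see whether the specific file named in the error sits under an overmount, its path can be compared against the mount table. This is an illustrative sketch only: covering_mount is a hypothetical helper, it approximates the kernel-side check by string-matching mount points from /proc/self/mountinfo, and it ignores escaped characters in mount paths. It does not reproduce what runc does internally.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name). Prints the deepest mount point
# that contains the given path, based on a mountinfo-formatted file.
# $1: absolute path to check
# $2: mountinfo file (defaults to /proc/self/mountinfo)
covering_mount() {
    awk -v p="$1" '
        {
            mp = $5                                  # field 5 is the mount point
            prefix = (mp == "/") ? "/" : mp "/"
            # Keep the longest matching mount point. On ties the later line
            # wins, since later mounts are stacked on top of earlier ones.
            if ((mp == p || index(p, prefix) == 1) && length(mp) >= length(best)) {
                best = mp
            }
        }
        END { print best }
    ' "${2:-/proc/self/mountinfo}"
}
```

Running covering_mount /proc/sys/net/ipv4/ip_unprivileged_port_start inside the outer container shows which mount the file is reached through. If the result is anything other than /proc, the file is reached through a mount stacked on top of procfs.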
Thanks for confirming that my issue really is different and is about sysbox-runc, not containerd. This was really confusing (when I saw the recommendation in this thread to downgrade containerd, I got lost).
I am a bit confused about the source of the problem (runc/sysbox/apparmor).
In our setup we use:
- host: docker cli 29.0.1, daemon 29.0.1, containerd 2.1.5 and sysbox-runc 0.6.7
- dind: docker cli 29.0.1, daemon 29.0.1, containerd 2.1.5 and runc 1.3.2 instead of runc 1.3.3, which is our current workaround.
P.S. OS on host is Debian 12, DIND is 23.2.2 as upstream provides it.
The full error is docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open sysctl net.ipv4.ip_unprivileged_port_start file: unsafe procfs detected: openat2 /proc/./sys/net/ipv4/ip_unprivileged_port_start: invalid cross-device link