runc
rootless bind-mount failure for read-only volume with 1.2.[0-4]
Description
Since runc 1.2, bind-mounting a read-only volume fails. An strace shows that an MS_REMOUNT is performed, which fails. The MS_REMOUNT was introduced with #3967. The issue came up while updating NixOS from runc 1.1.15 to 1.2.2, see https://github.com/NixOS/nixpkgs/pull/353610
When the volume option ro is given, the bind mount works as expected.
Steps to reproduce the issue
- mount a filesystem read-only, e.g. at /nix/store
- start podman with /nix/store as a volume
Describe the results you received and expected
$ tar cv --files-from /dev/null | podman import - scratchimg
$ podman run --runtime=runc -d --name=sleeping -v /nix/store:/bin scratchimg /bin/sleep 10
Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/nix/store" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied
workaround:
$ podman run --runtime=runc -d --name=sleeping -v /nix/store:/bin:ro scratchimg /bin/sleep 10
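For readers decoding the flags= value in the error, here is a minimal sketch that maps a mount(2) flags number to symbolic names. The constants are the MS_* values from <linux/mount.h>; the function name decode_mount_flags is made up for this illustration and only a subset of flags is listed.

```shell
#!/bin/sh
# Decode a mount(2) flags value into MS_* names.
# decode_mount_flags is a hypothetical helper, not part of runc or podman.
decode_mount_flags() {
    flags=$(( $1 ))
    out=""
    # name:value pairs, values as defined in <linux/mount.h> (subset only)
    for entry in \
        MS_RDONLY:1 MS_NOSUID:2 MS_NODEV:4 MS_NOEXEC:8 \
        MS_REMOUNT:32 MS_NOATIME:1024 MS_NODIRATIME:2048 \
        MS_BIND:4096 MS_REC:16384
    do
        name=${entry%%:*}
        value=${entry##*:}
        if [ $(( flags & value )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_mount_flags 0x5020   # prints: MS_REMOUNT MS_BIND MS_REC
decode_mount_flags 0x5026   # prints: MS_NOSUID MS_NODEV MS_REMOUNT MS_BIND MS_REC
```

Neither value (0x5020 here, 0x5026 in a later comment) contains MS_RDONLY, which would be consistent with a remount that tries to clear the read-only flag of the source mount.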
What version of runc are you using?
1.2.3
Host OS information
ANSI_COLOR="1;34"
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="24.11.20241231.edf04b7"
CPE_NAME="cpe:/o:nixos:nixos:24.11"
DEFAULT_HOSTNAME=nixos
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
ID_LIKE=""
IMAGE_ID=""
IMAGE_VERSION=""
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 24.11 (Vicuna)"
SUPPORT_END="2025-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VARIANT=""
VARIANT_ID=""
VENDOR_NAME=NixOS
VENDOR_URL="https://nixos.org/"
VERSION="24.11 (Vicuna)"
VERSION_CODENAME=vicuna
VERSION_ID="24.11"
Host kernel information
Linux prl 6.6.68 #1-NixOS SMP Fri Dec 27 12:58:58 UTC 2024 aarch64 GNU/Linux
Is podman setting "rw" explicitly here? If they are just doing "bind" with no additional options then the existing flags should be copied without touching any locked flags. (The change in behaviour was designed to fix some very severe bugs in how we handled clearing flags.)
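As a rough illustration of the "copy the existing flags" idea (not runc's actual code), the sketch below maps the per-mount options of a source mount to the MS_* flags a bind remount would need to keep set. The helper name lockable_flags is hypothetical, and the atime-related flags are left out for brevity.

```shell
#!/bin/sh
# Map per-mount options (as shown in field 6 of /proc/self/mountinfo) to the
# MS_* flags that a remount would have to preserve.
# lockable_flags is a hypothetical helper; atime flags are not handled.
lockable_flags() {
    out=""
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case $opt in
            ro)     flag=MS_RDONLY ;;
            nosuid) flag=MS_NOSUID ;;
            nodev)  flag=MS_NODEV ;;
            noexec) flag=MS_NOEXEC ;;
            *)      flag="" ;;
        esac
        if [ -n "$flag" ]; then
            out="${out:+$out|}$flag"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

lockable_flags "ro,nosuid,nodev"   # prints: MS_RDONLY|MS_NOSUID|MS_NODEV
```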
No, podman doesn't force a rw.
I can reproduce this issue only with /nix/store, and therefore I am closing this issue. When I can create a volume that behaves the same as /nix/store, I will raise a new issue.
It took some time to reproduce the issue. It is important that no podman instance is running when the bind-mount is performed. The following script reproduces the issue:
#!/bin/sh -eux
# make sure no podman is running!
pkill podman || true
dir=/tmp/test-ro-volume
mkdir -p "$dir"
sudo mount --bind "$dir" "$dir"
sudo mount -o remount,ro,bind "$dir"
tar cv --files-from /dev/null | podman import - scratchimg
podman run --runtime=runc -d --name=rootless-mount -v "$dir":/bin scratchimg /bin/sh || true
The resulting error is:
Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied
Couldn't repro this on my laptop (kernel 6.12.6-200.fc41.x86_64, podman 5.3.1) with either runc 1.2.3 or from git HEAD. In my case it shows:
Error: runc: runc create failed: unable to start container process: error during container init: exec: "/bin/sh": stat /bin/sh: no such file or directory: OCI runtime attempted to invoke a command that was not found
which, I guess, means the mount was successful.
Presumably something was fixed in either podman or the kernel.
Yes, your output shows a successful mount.
I updated my aarch64 system to kernel 6.12.7 and podman 5.3.1 and still get the mount issue.
Are there any differences in the mounts? Here's mine:
[kir@kir-tp1 runc]$ grep /tmp /proc/self/mountinfo
50 77 0:46 / /tmp rw,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
1112 50 0:46 /test-ro-volume /tmp/test-ro-volume ro,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
Yes, I have no tmpfs mounted at /tmp. My /tmp is part of /: 74 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
I changed my system to use tmpfs for /tmp:
52 74 0:46 / /tmp rw,nosuid,nodev shared:84 - tmpfs tmpfs rw
But the mount still fails:
Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5026: operation not permitted: OCI permission denied
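To make the mountinfo comparison between hosts easier, here is a small sketch that prints only the per-mount options (field 6 of /proc/self/mountinfo) for a given mount point. The function name mount_opts and its optional second argument are assumptions made for this example.

```shell
#!/bin/sh
# Print the per-mount options of a mount point.
# mount_opts is a hypothetical helper; the second argument lets you point it
# at a saved copy of mountinfo instead of /proc/self/mountinfo.
mount_opts() {
    # Field 5 is the mount point, field 6 the per-mount options.
    # If the path is mounted more than once, the last entry wins.
    awk -v mp="$1" '$5 == mp { opts = $6 } END { if (opts != "") print opts }' \
        "${2:-/proc/self/mountinfo}"
}

mount_opts /tmp/test-ro-volume
```

For the tmpfs example quoted above this would print ro,nosuid,nodev; it prints nothing if the path is not a mount point.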
@ck3d I'd like to help with this one, but I still can't reproduce this no matter how I try :(
Yeah, I was trying to figure out how to reproduce this but couldn't figure it out. I was going to install NixOS to double-check but I didn't have time...
I could reproduce the issue on Ubuntu 24.04.1 with the latest precompiled runc 1.2.4 (from the release page) on aarch64. I executed the example from above and used the following run command:
parallels@ubuntu-linux-2404:~$ podman run --runtime=/usr/local/sbin/runc -d --name=rootless-mount -v "$dir":/bin scratchimg /bin/sh
Error: /usr/local/sbin/runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied
Ubuntu's podman version is 4.9.3.
The script you posted works on openSUSE Tumbleweed. I'll test this in a VM...
In your original comment you said you had an strace log of the failure -- can you attach it here (preferably with -yy -s 512 or something to make sure everything is fully printed)?
I created strace.log with the following command:
strace -yy -s 512 podman run --runtime=/usr/local/sbin/runc -d --name=rootless-mount -v /tmp/test-ro-volume:/bin scratchimg /bin/sh 2> strace.log
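To pull the failing mount calls out of such a log, a filter along these lines may help. This is only a sketch: failed_mounts is a hypothetical helper, and it assumes the log actually contains the syscalls of the runc child processes, which strace only records when run with -f.

```shell
#!/bin/sh
# List mount(2) calls in an strace log that returned an error.
# failed_mounts is a hypothetical helper for this thread.
failed_mounts() {
    # Match "mount(" as a syscall name (not e.g. "umount(") whose result is
    # "= -1 E...", the way strace prints failed syscalls.
    grep -E '(^|[^a-z_0-9])mount\(.*= -1 E' "$1"
}

if [ -f strace.log ]; then
    failed_mounts strace.log
fi
```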
Thanks for helping.
No, podman doesn't force a rw.
It does force rw before 5.6.0, see containers/podman#25942.