runc

rootless bind-mount failure for read-only volume with 1.2.[0-4]

Open ck3d opened this issue 10 months ago • 14 comments

Description

Since runc version 1.2, the bind mount of a read-only volume fails. An strace shows that a mount with MS_REMOUNT is performed, and that it fails. MS_REMOUNT was introduced with #3967. The issue surfaced while updating NixOS from runc version 1.1.15 to 1.2.2; see https://github.com/NixOS/nixpkgs/pull/353610

When the volume option ro is given, the bind mount works as expected.

Steps to reproduce the issue

  1. Mount a filesystem read-only, e.g. at /nix/store.
  2. Start podman with /nix/store as a volume.

Describe the results you received and expected

$ tar cv --files-from /dev/null | podman import - scratchimg
$ podman run --runtime=runc -d --name=sleeping -v /nix/store:/bin scratchimg /bin/sleep 10
Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/nix/store" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied

workaround: $ podman run --runtime=runc -d --name=sleeping -v /nix/store:/bin:ro scratchimg /bin/sleep 10
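
For reference, the flags value in the error message can be decoded by hand. The following is a minimal sketch, assuming the MS_* constants from the Linux header <linux/mount.h> (only a subset is listed); the function name decode_mount_flags is made up for illustration and is not part of runc or podman.

```shell
#!/bin/sh
# Decode a mount(2) flags value, as printed in the runc error, into flag names.
# The constants are a subset of those in <linux/mount.h>.
decode_mount_flags() {
    flags=$(( $1 ))
    out=""
    for entry in \
        MS_RDONLY:0x1 MS_NOSUID:0x2 MS_NODEV:0x4 MS_NOEXEC:0x8 \
        MS_REMOUNT:0x20 MS_NOATIME:0x400 MS_NODIRATIME:0x800 \
        MS_BIND:0x1000 MS_REC:0x4000 MS_RELATIME:0x200000
    do
        name=${entry%%:*}            # part before the colon
        bit=$(( ${entry##*:} ))      # part after the colon, as a number
        if [ $(( flags & bit )) -ne 0 ]; then
            out="${out:+$out|}$name" # join names with "|"
        fi
    done
    printf '%s\n' "$out"
}

decode_mount_flags 0x5020   # prints MS_REMOUNT|MS_BIND|MS_REC
```

MS_RDONLY is absent while MS_REMOUNT and MS_BIND are set, so the failing call appears to request a read-write bind remount. That would be consistent with the "operation not permitted" result if the read-only flag of the source mount is locked for the rootless user namespace.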

What version of runc are you using?

1.2.3

Host OS information

ANSI_COLOR="1;34"
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="24.11.20241231.edf04b7"
CPE_NAME="cpe:/o:nixos:nixos:24.11"
DEFAULT_HOSTNAME=nixos
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
ID_LIKE=""
IMAGE_ID=""
IMAGE_VERSION=""
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 24.11 (Vicuna)"
SUPPORT_END="2025-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VARIANT=""
VARIANT_ID=""
VENDOR_NAME=NixOS
VENDOR_URL="https://nixos.org/"
VERSION="24.11 (Vicuna)"
VERSION_CODENAME=vicuna
VERSION_ID="24.11"

Host kernel information

Linux prl 6.6.68 #1-NixOS SMP Fri Dec 27 12:58:58 UTC 2024 aarch64 GNU/Linux

ck3d avatar Jan 02 '25 16:01 ck3d

Is podman setting "rw" explicitly here? If they are just doing "bind" with no additional options then the existing flags should be copied without touching any locked flags. (The change in behaviour was designed to fix some very severe bugs in how we handled clearing flags.)
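
To illustrate what "copying the existing flags" means for a bind remount, here is a rough sketch. It is not runc's actual code: the helper name remount_flags_keeping is hypothetical, the option-to-flag mapping covers only a handful of common per-mount options, and the base flags MS_BIND|MS_REC|MS_REMOUNT are taken from the 0x5020 value in the error above.

```shell
#!/bin/sh
# Hypothetical helper (not runc's implementation): compute the flags for a
# bind remount that keeps the per-mount options already set on the source.
# $1 is a comma-separated option list as shown in /proc/self/mountinfo.
remount_flags_keeping() {
    flags=$(( 0x1000 | 0x4000 | 0x20 ))   # MS_BIND | MS_REC | MS_REMOUNT
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case $opt in
            ro)         flags=$(( flags | 0x1 )) ;;       # MS_RDONLY
            nosuid)     flags=$(( flags | 0x2 )) ;;       # MS_NOSUID
            nodev)      flags=$(( flags | 0x4 )) ;;       # MS_NODEV
            noexec)     flags=$(( flags | 0x8 )) ;;       # MS_NOEXEC
            noatime)    flags=$(( flags | 0x400 )) ;;     # MS_NOATIME
            nodiratime) flags=$(( flags | 0x800 )) ;;     # MS_NODIRATIME
            relatime)   flags=$(( flags | 0x200000 )) ;;  # MS_RELATIME
        esac
    done
    IFS=$old_ifs
    printf '0x%x\n' "$flags"
}

remount_flags_keeping ro,nosuid,nodev   # prints 0x5027
```

For a source mounted ro,nosuid,nodev this yields 0x5027. A remount of such a source that passes 0x5020 leaves those three bits out, which amounts to asking the kernel to clear them.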

cyphar avatar Jan 02 '25 17:01 cyphar

No, podman doesn't force a rw.

I can reproduce this issue only with /nix/store, so I am closing it. If I manage to create a volume that behaves like /nix/store, I will open a new issue.

ck3d avatar Jan 03 '25 11:01 ck3d

It took some time to reproduce the issue. It is important that no podman instance is running when the bind mount is performed. The following script reproduces the issue:

#!/bin/sh -eux

# make sure no podman is running!
pkill podman || true

dir=/tmp/test-ro-volume
mkdir -p "$dir"

sudo mount --bind "$dir" "$dir"
sudo mount -o remount,ro,bind "$dir"

tar cv --files-from /dev/null | podman import - scratchimg

podman run --runtime=runc -d --name=rootless-mount -v "$dir":/bin scratchimg /bin/sh || true

The resulting error is: Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied

ck3d avatar Jan 05 '25 14:01 ck3d

Couldn't repro this on my laptop (kernel 6.12.6-200.fc41.x86_64, podman 5.3.1) with either runc 1.2.3 or from git HEAD. In my case it shows:

Error: runc: runc create failed: unable to start container process: error during container init: exec: "/bin/sh": stat /bin/sh: no such file or directory: OCI runtime attempted to invoke a command that was not found

which, I guess, means the mount was successful.

Presumably something was fixed in either podman or the kernel.

kolyshkin avatar Jan 06 '25 01:01 kolyshkin

Yes, your output shows a successful mount.

I updated my aarch64 system to kernel 6.12.7 and podman 5.3.1 and still get the mount failure.

ck3d avatar Jan 06 '25 12:01 ck3d

Are there any differences in the mounts? Here's mine:

[kir@kir-tp1 runc]$ grep /tmp /proc/self/mountinfo 
50 77 0:46 / /tmp rw,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
1112 50 0:46 /test-ro-volume /tmp/test-ro-volume ro,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
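
When comparing mounts across machines, the field that matters here is the per-mount option list (the 6th field of mountinfo), not the superblock options after the "-" separator. A small sketch for pulling it out, assuming awk is available; the function name mount_point_opts is made up for illustration, and the sample lines are the two shown above.

```shell
#!/bin/sh
# Print the per-mount options (6th field of a mountinfo line) for the mount
# point given as $1. Reads mountinfo-formatted lines from stdin, so it works
# on /proc/self/mountinfo as well as on saved output.
mount_point_opts() {
    awk -v mp="$1" '$5 == mp { print $6 }'
}

mount_point_opts /tmp/test-ro-volume <<'EOF'
50 77 0:46 / /tmp rw,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
1112 50 0:46 /test-ro-volume /tmp/test-ro-volume ro,nosuid,nodev shared:83 - tmpfs tmpfs rw,seclabel,size=32766188k,nr_inodes=1048576,inode64
EOF
# prints ro,nosuid,nodev
```

On a live system the same function can be fed from /proc/self/mountinfo. In the output above the bind mount is ro at the mount level while the underlying tmpfs superblock stays rw, which matches what the "mount -o remount,ro,bind" step of the script sets up.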

kolyshkin avatar Jan 06 '25 21:01 kolyshkin

Yes, I have no tmpfs mounted at /tmp. My /tmp is part of /: 74 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw

ck3d avatar Jan 07 '25 06:01 ck3d

I changed my system to use tmpfs for /tmp: 52 74 0:46 / /tmp rw,nosuid,nodev shared:84 - tmpfs tmpfs rw

But the mount still fails: Error: runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5026: operation not permitted: OCI permission denied

ck3d avatar Jan 07 '25 11:01 ck3d

@cyphar I'd like to help with this one, but I still can't reproduce this no matter how I try :(

Update: sorry, I meant to mention @ck3d

kolyshkin avatar Feb 07 '25 02:02 kolyshkin

@ck3d I'd like to help with this one, but I still can't reproduce this no matter how I try :(

kolyshkin avatar Feb 07 '25 05:02 kolyshkin

Yeah, I was trying to figure out how to reproduce this but couldn't figure it out. I was going to install NixOS to double-check but I didn't have time...

cyphar avatar Feb 08 '25 02:02 cyphar

I could reproduce the issue on Ubuntu 24.04.1 with the latest precompiled runc 1.2.4 (from the release page) on an aarch64 machine. I executed the script from above and used the following run command:

parallels@ubuntu-linux-2404:~$ podman run --runtime=/usr/local/sbin/runc -d --name=rootless-mount -v "$dir":/bin scratchimg /bin/sh
Error: /usr/local/sbin/runc: runc create failed: unable to start container process: error during container init: error mounting "/tmp/test-ro-volume" to rootfs at "/bin": mount dst=/bin, dstFd=/proc/thread-self/fd/8, flags=0x5020: operation not permitted: OCI permission denied

Ubuntu's podman version is 4.9.3.

ck3d avatar Feb 09 '25 16:02 ck3d

The script you posted works on openSUSE Tumbleweed. I'll test this in a VM...

In your original comment you said you had an strace log of the failure -- can you attach it here (preferably with -yy -s 512 or something to make sure everything is fully printed)?

cyphar avatar Feb 14 '25 01:02 cyphar

I created strace.log with the following command:

strace -yy -s 512 podman run --runtime=/usr/local/sbin/runc -d --name=rootless-mount -v /tmp/test-ro-volume:/bin scratchimg /bin/sh 2> strace.log

Thanks for helping.

ck3d avatar Feb 15 '25 08:02 ck3d

> No, podman doesn't force a rw.

It does force rw before 5.6.0, see containers/podman#25942.
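
Going by this comment, whether a setup is affected depends on the podman version. A minimal sketch of such a check, assuming a sort that supports -V (GNU coreutils and BusyBox do) and plain x.y.z version strings; the helper name version_lt is made up, the 5.6.0 threshold comes only from the statement above, and 4.9.3 is the Ubuntu podman version reported earlier in the thread (on a real system the value would come from "podman --version").

```shell
#!/bin/sh
# Succeeds (exit status 0) if version $1 is strictly lower than version $2.
# Relies on `sort -V` for version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

ver=4.9.3   # hard-coded for illustration
if version_lt "$ver" 5.6.0; then
    echo "podman $ver is older than 5.6.0: pass :ro explicitly for read-only sources"
fi
```

With such a version, the :ro workaround from the original report remains the simplest way to avoid the failing remount.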

wegank avatar Aug 18 '25 09:08 wegank