newuidmap Fails with “Operation not permitted” When Running Podman Inside amd64 Podman container on macOS with Rosetta
Issue Description
I am experiencing an issue when trying to run an amd64 Podman container inside an amd64 Podman container on macOS with Rosetta. The error occurs while newuidmap sets up the user namespace, and produces the following message:
time="2024-06-19T00:52:32Z" level=error msg="running /usr/bin/newuidmap 12 0 1000 1 1 1 999 1000 1001 64535: newuidmap: write to uid_map failed: Operation not permitted\n"
Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1
Steps to reproduce the issue
1. Install Podman (5.1.1) on macOS M1.
2. Run the following command:
podman run --arch=amd64 --user podman --privileged quay.io/podman/stable podman run --security-opt label=disable --arch=amd64 ubi8 echo hello
Describe the results you received
- When running the nested Podman command with --arch=amd64, the operation fails with an “Operation not permitted” error during the newuidmap setup.
time="2024-06-19T00:52:32Z" level=error msg="running `/usr/bin/newuidmap 12 0 1000 1 1 1 999 1000 1001 64535`: newuidmap: write to uid_map failed: Operation not permitted\n"
Error: cannot set up namespace using "/usr/bin/newuidmap": exit status 1
- The same setup works correctly with --arch=arm64.
podman run --arch=arm64 --user podman --privileged quay.io/podman/stable podman run --security-opt label=disable --arch=amd64 ubi8 echo hello
Describe the results you expected
The newuidmap should correctly map user IDs without encountering permission issues, allowing Podman to run nested containers with --arch=amd64 on macOS with Rosetta.
podman info output
host:
arch: arm64
buildahVersion: 1.36.0
cgroupControllers:
- cpu
- io
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.10-1.fc40.aarch64
path: /usr/bin/conmon
version: 'conmon version 2.1.10, commit: '
cpuUtilization:
idlePercent: 98.82
systemPercent: 0.25
userPercent: 0.93
cpus: 8
databaseBackend: sqlite
distribution:
distribution: fedora
variant: coreos
version: "40"
eventLogger: journald
freeLocks: 1800
hostname: localhost.localdomain
idMappings:
gidmap:
- container_id: 0
host_id: 1000
size: 1
- container_id: 1
host_id: 100000
size: 1000000
uidmap:
- container_id: 0
host_id: 502
size: 1
- container_id: 1
host_id: 100000
size: 1000000
kernel: 6.8.11-300.fc40.aarch64
linkmode: dynamic
logDriver: journald
memFree: 5679370240
memTotal: 16703303680
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.11.0-1.20240531102943328308.main.4.g6838c50.fc40.aarch64
path: /usr/libexec/podman/aardvark-dns
version: aardvark-dns 1.12.0-dev
package: netavark-1.11.0-1.20240606174759319307.main.8.gfebe31a.fc40.aarch64
path: /usr/libexec/podman/netavark
version: netavark 1.12.0-dev
ociRuntime:
name: crun
package: crun-1.15-1.20240607090105650503.main.32.gea54402.fc40.aarch64
path: /usr/bin/crun
version: |-
crun version UNKNOWN
commit: 7cfd0aeb40e4605b6b0ee0afd9cfca80f9c5f68a
rundir: /run/user/502/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
os: linux
pasta:
executable: /usr/bin/pasta
package: passt-0^20240510.g7288448-1.fc40.aarch64
version: |
pasta 0^20240510.g7288448-1.fc40.aarch64-pasta
Copyright Red Hat
GNU General Public License, version 2 or later
<https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
remoteSocket:
exists: true
path: /run/user/502/podman/podman.sock
rootlessNetworkCmd: pasta
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: true
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: true
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.2.2-2.fc40.aarch64
version: |-
slirp4netns version 1.2.2
commit: 0ee2d87523e906518d34a6b423271e4826f71faf
libslirp: 4.7.0
SLIRP_CONFIG_VERSION_MAX: 4
libseccomp: 2.5.3
swapFree: 0
swapTotal: 0
uptime: 3h 18m 33.00s (Approximately 0.12 days)
variant: v8
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries:
search:
- docker.io
store:
configFile: /var/home/core/.config/containers/storage.conf
containerStore:
number: 83
paused: 0
running: 1
stopped: 82
graphDriverName: overlay
graphOptions: {}
graphRoot: /var/home/core/.local/share/containers/storage
graphRootAllocated: 106769133568
graphRootUsed: 35398475776
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "true"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "false"
imageCopyTmpDir: /var/tmp
imageStore:
number: 266
runRoot: /run/user/502/containers
transientStore: false
volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
APIVersion: 5.1.1
Built: 1717459200
BuiltTime: Mon Jun 3 17:00:00 2024
GitCommit: ""
GoVersion: go1.22.3
Os: linux
OsArch: linux/arm64
Version: 5.1.1
Podman in a container
Yes
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Additional information
Any ideas here, @giuseppe?
some weird interaction between binfmt and a file with capabilities (newuidmap).
Could you please share the output of grep . /proc/sys/fs/binfmt_misc/*?
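For anyone checking their own machine, the same files can be scanned for the credentials flag. This is a minimal sketch, assuming each handler file contains a `flags: <letters>` line as the kernel prints it; the helper name `binfmt_has_credentials_flag` is made up for illustration. Run it inside the machine VM (for example after `podman machine ssh`).

```shell
#!/bin/sh
# Hypothetical helper: reads one binfmt_misc entry from stdin and prints
# "yes" if its "flags:" line contains C (credentials), otherwise "no".
binfmt_has_credentials_flag() {
    flags=$(sed -n 's/^flags: *//p')
    case "$flags" in
        *C*) echo yes ;;
        *)   echo no ;;
    esac
}

# Walk every registered handler and report its C flag status.
for entry in /proc/sys/fs/binfmt_misc/*; do
    name=${entry##*/}
    # "register" and "status" are control files, not handlers.
    case "$name" in register|status) continue ;; esac
    # Skip silently when binfmt_misc is not mounted or the entry is unreadable.
    [ -r "$entry" ] || continue
    printf '%s: C flag set: %s\n' "$name" "$(binfmt_has_credentials_flag < "$entry")"
done
```

A handler that reports `no` runs setuid or file-capability binaries such as newuidmap with the interpreter's credentials, which would be consistent with the failure above.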
The default binfmt setup doesn't allow setuid binaries; see https://docs.kernel.org/admin-guide/binfmt-misc.html:
C - credentials
Currently, the behavior of binfmt_misc is to calculate the credentials and security token of the new process according to the interpreter. When this flag is included, these attributes are calculated according to the binary. It also implies the O flag. This feature should be used with care as the interpreter will run with root permissions when a setuid binary owned by root is run with binfmt_misc.
We could of course change the binfmt_misc configs in the machine to set this flag.
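As a rough illustration of what setting the flag means, a binfmt_misc rule has the form `:name:type:offset:magic:mask:interpreter:flags`, and the change amounts to including `C` in the last field. The sketch below only builds and prints such a rule. The handler name `example-x86_64` and the interpreter path `/usr/bin/example-emulator` are placeholders, and the magic and mask are the values commonly used for x86_64 ELF binaries, assumed here rather than taken from the machine image.

```shell
#!/bin/sh
# Hypothetical helper: assemble a binfmt_misc rule from its fields.
# Type M matches on magic bytes; the empty offset field defaults to 0.
binfmt_rule() {
    printf ':%s:M::%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Assumed ELF magic and mask for x86_64 binaries; verify against your handler.
magic='\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x3e\x00'
mask='\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'

# Placeholder name and interpreter path.
# F: the interpreter is opened at registration time.
# C: credentials are calculated from the binary, not the interpreter.
rule=$(binfmt_rule example-x86_64 "$magic" "$mask" /usr/bin/example-emulator FC)
printf '%s\n' "$rule"

# Registering for real needs root inside the VM and is left commented out:
#   printf '%s\n' "$rule" > /proc/sys/fs/binfmt_misc/register
```

As the kernel documentation quoted above warns, with `C` set the interpreter runs with root permissions whenever a setuid root binary is executed through it.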
If this capability exists, we should take advantage of it. Podman machines are not expected to have network-facing connections, so the risk of turning something like this on by default is mitigated, and the downsides of people hitting it are big. One question, though: would this affect Rosetta-based systems?
Yeah, changing it for Rosetta (x86_64) is simple: https://github.com/containers/podman-machine-os/blob/a52eab8a5fa6790495d90180d69ef94c09f6150e/podman-image-daily/rosetta-activation.sh#L8
We can add the flag there. What I am not sure about is how to configure the qemu-user-static scripts for the other arches to make use of it. I would like it to be consistent, not just working with Rosetta.
A friendly reminder that this issue had no activity for 30 days.
There is a PR: https://github.com/containers/podman-machine-os/pull/10
I think this should work now in the 5.2 images.