Volume mounting ownership inconsistent between Docker & Podman
Issue Description
Volume ownership in Podman differs from Docker's, and behaves in unexpected ways that make it difficult to design container images that work on both platforms.
The issue is explained with a test rig at https://github.com/BarDweller/mountperms
Steps to reproduce the issue
Run the test rig at https://github.com/BarDweller/mountperms.
Describe the results you received
Mounted volume ownership can ignore the ownership of the mountpoint directory in the container image, and does not match Docker's behavior when a volume is mounted at a mountpoint that does not exist in the image.
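As a minimal sketch of the sort of check involved (the image, user, and volume names here are illustrative, not the rig's):

# Build an image whose mountpoint is owned by a non-root user.
cat > Containerfile <<'EOF'
FROM docker.io/library/alpine
RUN addgroup -g 1234 groupa \
 && adduser -D -u 1234 -G groupa usera \
 && mkdir /mountpoint && chown usera:groupa /mountpoint
EOF
podman build -t owned-by-a .

# Mount a fresh named volume at the pre-owned mountpoint and inspect it.
podman volume create vol1
podman run --rm -v vol1:/mountpoint owned-by-a stat -c '%U:%G' /mountpoint

# Run the same two commands with docker and compare the output.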
Describe the results you expected
Mounted volume ownership should be predictable and, ideally, match Docker's.
podman info output
$ podman info
host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc39.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 97.46
    systemPercent: 0.57
    userPercent: 1.97
  cpus: 2
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: workstation
    version: "39"
  eventLogger: journald
  freeLocks: 2031
  hostname: PODMANVM
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.7.5-200.fc39.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 4063780864
  memTotal: 10923556864
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc39.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-1.fc39.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.3-1.fc39.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.3
      commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240220.g1e6f92b-1.fc39.x86_64
    version: |
      pasta 0^20240220.g1e6f92b-1.fc39.x86_64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
      <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 1941434368
  swapTotal: 1989144576
  uptime: 578h 8m 19.00s (Approximately 24.08 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  127.0.0.1:5000:
    Blocked: false
    Insecure: true
    Location: 127.0.0.1:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: 127.0.0.1:5000
    PullFromMirror: ""
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/ajo1/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/ajo1/.local/share/containers/storage
  graphRootAllocated: 19769851904
  graphRootUsed: 19056824320
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 25
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/ajo1/.local/share/containers/storage/volumes
version:
  APIVersion: 4.7.0
  Built: 1695838680
  BuiltTime: Wed Sep 27 14:18:00 2023
  GitCommit: ""
  GoVersion: go1.21.1
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.0
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
Additional information
Can you test with a newer podman?
Behavior looks basically identical with newer code; we haven't substantially changed our volume-init logic in a while.
I'll take this one, but I need to do some research on exactly what Docker does before I start coding: do they unconditionally chown/chmod a volume on every mount into a container, or only the first time it is mounted into a specific container?
From the tests I've run so far:
The ownership/permissions apply only to the mountpoint, not to the volume content. E.g., if an image has a directory /mountpoint owned by usera:groupa, then any container based on that image with a volume mounted at /mountpoint will see /mountpoint owned by usera:groupa, regardless of the uid the container runs as, and regardless of whether the volume was previously mounted into containers based on other images that gave it different ownership. A sketch of this rule as commands follows (using the illustrative owned-by-a image from the issue body):
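# Rebuild the illustrative image from the issue body so docker can see it.
docker build -t owned-by-a -f Containerfile .

docker volume create vola
docker run --rm --user 0 -v vola:/mountpoint owned-by-a stat -c '%U:%G' /mountpoint
# -> usera:groupa: the running uid does not matter; the image's
#    ownership of the mountpoint directory wins.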
If the image does not contain the directory /mountpoint, then with Docker, mounting a fresh volume at /mountpoint results in /mountpoint being owned by root within the container. If the volume has previously been mounted into another container, the ownership of /mountpoint is taken from the last time it was mounted. (E.g., mount a fresh volume at /ownedbya in a container based on an image where /ownedbya is owned by usera:groupa, then mount the same volume at /doesnotexist in a new container whose image does not contain that directory, and /doesnotexist will be owned by usera:groupa.) For instance, with the illustrative owned-by-a image (which has /mountpoint but no /doesnotexist):
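docker volume create volb
docker run --rm -v volb:/doesnotexist owned-by-a stat -c '%U:%G' /doesnotexist
# Fresh volume, mountpoint absent from the image -> root:root

# Give the volume ownership through the image's /mountpoint, then
# remount it at the nonexistent path.
docker run --rm -v volb:/mountpoint owned-by-a true
docker run --rm -v volb:/doesnotexist owned-by-a stat -c '%U:%G' /doesnotexist
# -> usera:groupa, carried over from the previous mount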
With Docker, when a volume is mounted into successive containers, the mountpoint in each container follows those rules; i.e., the ownership always obeys the intent of each container's image. With Podman, the volume currently keeps the ownership from the first time it was ever mounted. The divergence can be sketched with a second illustrative image that disagrees about the owner:
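# Second image: same path, different owner (names illustrative).
cat > Containerfile.b <<'EOF'
FROM docker.io/library/alpine
RUN addgroup -g 5678 groupb \
 && adduser -D -u 5678 -G groupb userb \
 && mkdir /mountpoint && chown userb:groupb /mountpoint
EOF
podman build -t owned-by-b -f Containerfile.b .

podman volume create volc
podman run --rm -v volc:/mountpoint owned-by-a stat -c '%U:%G' /mountpoint
# -> usera:groupa (first ever mount of volc)
podman run --rm -v volc:/mountpoint owned-by-b stat -c '%U:%G' /mountpoint
# Docker would show userb:groupb here (each image's intent wins);
# podman still shows usera:groupa, inherited from the first mount.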
I haven't done any testing of the behavior when a volume is mounted concurrently into multiple containers with conflicting ownership/permissions. I would strongly expect each container to get its own mountpoint with the ownership/permissions dictated by its image; in the case where the image doesn't have the mountpoint, I'd expect that to be undefined ;)
Content in the volume is never affected by any of this: if you create files in a volume as root/usera/userb, the files retain their ownership/permissions regardless of which mountpoint the volume is mounted at. There might be some fun here with setgid on mountpoints in images, and how that behaves if two containers request conflicting ownership for the same volume. A sketch of that last point with the illustrative images above (numeric ids, since the second image doesn't know usera by name):
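# Create a file as usera (uid 1234) in the volume...
podman run --rm -v volc:/mountpoint owned-by-a \
  su usera -c 'touch /mountpoint/afile'
# ...then inspect it from the other image, which has no usera.
podman run --rm -v volc:/mountpoint owned-by-b stat -c '%u:%g' /mountpoint/afile
# -> 1234:1234: content keeps its ownership no matter which
#    mountpoint rules applied to the directory itself.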
@BarDweller Any chance you can test https://github.com/containers/podman/pull/22727?