Mount point ownership not consistent with Docker's behaviour
Issue Description
When mounting a volume into a container, the mount point directory should preserve its ownership. This seems to work only for the very first run/mount. On subsequent mounts, the mount point directory's ownership is altered to the ownership set by the first mounter.
This happens at least on podman v4.3 through v4.6.
Steps to reproduce the issue
Run the following script against the podman Docker-compat socket:
#!/bin/sh
set -e
cat <<EOF > Dockerfile.usera
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1001:1002 /workspace
USER 1001:1002
EOF
cat <<EOF > Dockerfile.userb
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1003:1004 /workspace
USER 1003:1004
EOF
docker build -q . -f Dockerfile.usera -t alpine-usera
docker build -q . -f Dockerfile.userb -t alpine-userb
docker volume rm test-volume || true
docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
# command below fails on podman because of permissions
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'
docker volume rm test-volume || true
Describe the results you received
The script exits with a non-zero exit code and an error message.
touch: /workspace/b: Permission denied
Describe the results you expected
The script exits with exit code 0.
podman info output
host:
  arch: amd64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 96.82
    systemPercent: 0.74
    userPercent: 2.44
  cpus: 16
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: workstation
    version: "37"
  eventLogger: journald
  freeLocks: 2011
  hostname: rigel
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.4.9-100.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 942694400
  memTotal: 67101196288
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.7.0-1.fc37.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: netavark-1.7.0-1.fc37.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: crun-1.8.6-1.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.6
      commit: 73f759f4a39769f60990e7d225f561b4f4f06bcf
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20230625.g32660ce-1.fc37.x86_64
    version: |
      pasta 0^20230625.g32660ce-1.fc37.x86_64
      Copyright Red Hat
      GNU Affero GPL version 3 or later <https://www.gnu.org/licenses/agpl-3.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 1392640
  swapTotal: 8589930496
  uptime: 89h 60m 31.00s (Approximately 3.71 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: false
      Location: quay.io/mvasek
      PullFromMirror: ""
    Prefix: docker.io
    PullFromMirror: ""
  localhost:50000:
    Blocked: false
    Insecure: true
    Location: localhost:50000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:50000
    PullFromMirror: ""
  search:
  - example.com
store:
  configFile: /home/mvasek/.config/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 1
    stopped: 9
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/mvasek/.local/share/containers/storage
  graphRootAllocated: 1022488477696
  graphRootUsed: 409338372096
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 65
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/mvasek/.local/share/containers/storage/volumes
version:
  APIVersion: 4.6.2-dev
  Built: 1692205486
  BuiltTime: Wed Aug 16 19:04:46 2023
  GitCommit: 8183ba8b256442910154d4d264deac9d12242eae
  GoVersion: go1.20.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.2-dev
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
I tested this on rootless but I believe the same thing happens for privileged too.
Additional information
It happens every time.
I see that the ownership is kept if you delete the volume before running the second command.
Does docker automatically delete the volume when the first container exits?
@giuseppe no the volume persists.
Another way to reproduce: try building an app using the pack CLI with podman and an untrusted builder.
@giuseppe but you might be onto something: the ownership behaves differently the moment I try to write something into the volume.
Maybe I isolated the bug in the wrong way, but there are definitely some issues with volume mounting. The pack CLI does the application build in multiple containers that share some data via volumes. With Docker it works; with podman it fails because of ownership issues.
- Create a simple Go app (e.g. hello world).
- Run pack build my-go-app -Bghcr.io/knative/builder-jammy-full:latest --docker-host=inherit --trust-builder=0.
- The build fails because of permissions on the shared volume.
@giuseppe try running:
#!/bin/sh
set -e
cat <<EOF > Dockerfile.usera
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1001:1002 /workspace
USER 1001:1002
EOF
cat <<EOF > Dockerfile.userb
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1003:1004 /workspace
USER 1003:1004
EOF
docker build -q . -f Dockerfile.usera -t alpine-usera
docker build -q . -f Dockerfile.userb -t alpine-userb
docker volume rm test-volume || true
docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'
docker volume rm test-volume || true
With Docker it works, but on podman it fails.
@giuseppe note that if the first container actually tried to write to /workspace/ it would fail with Moby too. But in our use case the first container uses the volume as read-only, although it may not declare it via :ro.
@mheon do we need to change ownership every time we use the volume in a container?
I have to assume we added that code for a reason, but I can't recall exactly why. Almost certainly a bugfix, but exactly what was being fixed is unclear. The exact on-mount behavior for volumes versus Docker has been a persistent problem.
fyi in the past even the very first mounting container had bad ownership, see https://github.com/containers/podman/pull/10905
I actually cannot find an explicit chown of the volume mountpoint anywhere in the mount code. So I'm actually not 100% on where this is being done; it may be an unintentional side-effect of another chown doing something else?
Looks like the chown is called only when the volume is brand new -- created together with a new container.
https://github.com/containers/podman/blob/5ea019419cd78457230cf4d15ee459bf4288a1bd/libpod/container_internal_common.go#L2873-L2881
wrt:
docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'
It appears that the chown is called only for the first container.
There is some state, vol.state.NeedsChown.
I assume this ensures that the chown is done only once?
vol.state.NeedsChown seems to be unset by the first chown, done for the first container, so subsequent containers won't chown it.
@giuseppe how important is vol.state.NeedsChown?
https://github.com/containers/podman/blob/5ea019419cd78457230cf4d15ee459bf4288a1bd/libpod/container_internal_common.go#L2837-L2838
Ah, ok, per @matejvasek it's fixVolumePermissions()
Looking further, it's tied to a bool, NeedsChown, in the volume config. Set to true at volume create, false once permissions have been fixed during first mount into a container. Dropping the bool entirely and making the chown unconditional ought to fix this?
@mheon I believe it will fix the issue, but I don't know if it could have any adverse effects.
It is not doing a recursive chown, correct? I think the goal there was to make sure the volume is owned by the primary user of the container. I think I had a PR on this code at one point to attempt to change it, but I gave up. https://github.com/containers/podman/pull/16782
make sure the volume is owned by the primary user of the container.
Small correction: the primary user's uid/gid is used only if the mount point does not already exist in the container image. If the mount point exists (as a directory), then the uid/gid of that directory is used.
Setting the ownership just once makes sense if you assume that the volume will always be used by just a single container.
However, that's not my case: the pack CLI runs multiple containers in sequence on one volume.
If there's a reason we added this originally, it's Docker compat. If Docker doesn't do the same thing, then that reason is not valid.
If there's a reason we added this originally, it's Docker compat. If Docker doesn't do the same thing, then that reason is not valid.
What do you mean by this? The fact that we do the chown, or the fact that we do it only once?
Only once. There's no reason we'd add such complexity other than to match Docker
Looks like NeedsChown was introduced in https://github.com/containers/podman/pull/6747, which was fixing https://github.com/containers/podman/issues/5698.
The issue does not directly mention Docker.
Also, the --userns behaviour that was supposed to be fixed does not even exist on Docker. So I believe that NeedsChown has nothing to do with Docker compatibility.