Mount point ownership not consistent with Docker's behaviour

Open matejvasek opened this issue 2 years ago • 57 comments

Issue Description

When a volume is mounted into a container, the mount point directory should preserve its ownership. This seems to work only for the very first run/mount. Subsequent mounts have the ownership of the mount point directory altered (to the ownership set by the first mounter).

This happens in at least podman v4.3 through v4.6.

Steps to reproduce the issue

Run the following script against the podman Docker-compat socket:

#!/bin/sh

set -e

cat <<EOF > Dockerfile.usera
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1001:1002 /workspace
USER 1001:1002
EOF

cat <<EOF > Dockerfile.userb
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1003:1004 /workspace
USER 1003:1004
EOF

docker build -q . -f Dockerfile.usera -t alpine-usera
docker build -q . -f Dockerfile.userb -t alpine-userb
docker volume rm test-volume || true
docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
# command below fails on podman because of permissions
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'
docker volume rm test-volume || true

Describe the results you received

The script exits with a non-zero exit code and an error message:

touch: /workspace/b: Permission denied
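
To see why the touch fails, it can help to print the numeric owner of the mount point as each image sees it. The following is a minimal sketch, assuming the images and the volume from the script above, and assuming that the image's `stat` accepts `-c '%u:%g'` (BusyBox in alpine normally does). `mountpoint_owner` is a hypothetical helper name, not part of podman or Docker.

```shell
#!/bin/sh
# Hypothetical helper: print uid:gid of /workspace as seen inside a container
# started from image $1, with test-volume mounted at /workspace.
mountpoint_owner() {
    docker run --rm -v test-volume:/workspace "$1" stat -c '%u:%g' /workspace
}
```

Calling `mountpoint_owner alpine-usera` and then `mountpoint_owner alpine-userb` on a fresh volume shows the owner each container gets. On the affected podman versions the second call would be expected to report the first image's 1001:1002 rather than 1003:1004.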

Describe the results you expected

The script exits with exit code 0.

podman info output

host:
  arch: amd64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 96.82
    systemPercent: 0.74
    userPercent: 2.44
  cpus: 16
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: workstation
    version: "37"
  eventLogger: journald
  freeLocks: 2011
  hostname: rigel
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.4.9-100.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 942694400
  memTotal: 67101196288
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.7.0-1.fc37.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: netavark-1.7.0-1.fc37.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: crun-1.8.6-1.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.6
      commit: 73f759f4a39769f60990e7d225f561b4f4f06bcf
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20230625.g32660ce-1.fc37.x86_64
    version: |
      pasta 0^20230625.g32660ce-1.fc37.x86_64
      Copyright Red Hat
      GNU Affero GPL version 3 or later <https://www.gnu.org/licenses/agpl-3.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 1392640
  swapTotal: 8589930496
  uptime: 89h 60m 31.00s (Approximately 3.71 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: false
      Location: quay.io/mvasek
      PullFromMirror: ""
    Prefix: docker.io
    PullFromMirror: ""
  localhost:50000:
    Blocked: false
    Insecure: true
    Location: localhost:50000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: localhost:50000
    PullFromMirror: ""
  search:
  - example.com
store:
  configFile: /home/mvasek/.config/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 1
    stopped: 9
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/mvasek/.local/share/containers/storage
  graphRootAllocated: 1022488477696
  graphRootUsed: 409338372096
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 65
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/mvasek/.local/share/containers/storage/volumes
version:
  APIVersion: 4.6.2-dev
  Built: 1692205486
  BuiltTime: Wed Aug 16 19:04:46 2023
  GitCommit: 8183ba8b256442910154d4d264deac9d12242eae
  GoVersion: go1.20.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.2-dev

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

I tested this on rootless but I believe the same thing happens for privileged too.

Additional information

Happens always.

matejvasek avatar Aug 16 '23 19:08 matejvasek

I see the ownership is kept if you delete the volume before running the second command.

Does docker automatically delete the volume when the first container exits?

giuseppe avatar Aug 17 '23 09:08 giuseppe

@giuseppe no, the volume persists.

matejvasek avatar Aug 17 '23 11:08 matejvasek

Another way to reproduce: try building an app using the pack CLI with podman and an untrusted builder.

matejvasek avatar Aug 17 '23 11:08 matejvasek

@giuseppe but you might be onto something: the ownership behaves differently the moment I try to write something into the volume.

matejvasek avatar Aug 17 '23 12:08 matejvasek

Maybe I isolated the bug in the wrong way, but there are definitely some issues with volume mounting. The pack CLI builds an application in multiple containers that share some data via volumes. With Docker it works; with podman it fails because of ownership issues.

matejvasek avatar Aug 17 '23 12:08 matejvasek

  1. Create a simple Go app (e.g. hello world).
  2. Run pack build my-go-app -Bghcr.io/knative/builder-jammy-full:latest --docker-host=inherit --trust-builder=0.
  3. The build fails because of permissions on the shared volume.

matejvasek avatar Aug 17 '23 12:08 matejvasek

@giuseppe try running:

#!/bin/sh

set -e

cat <<EOF > Dockerfile.usera
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1001:1002 /workspace
USER 1001:1002
EOF

cat <<EOF > Dockerfile.userb
FROM alpine
USER root
RUN mkdir -p /workspace
RUN chown 1003:1004 /workspace
USER 1003:1004
EOF

docker build -q . -f Dockerfile.usera -t alpine-usera
docker build -q . -f Dockerfile.userb -t alpine-userb
docker volume rm test-volume || true
docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'
docker volume rm test-volume || true

With docker it works but on podman it fails.

matejvasek avatar Aug 17 '23 14:08 matejvasek

@giuseppe note that if the first container actually tried to write to /workspace/ it would fail with Moby too. But in our use case the first container uses the volume as read-only, although it may not declare that via :ro.

matejvasek avatar Aug 17 '23 14:08 matejvasek

@mheon do we need to change ownership every time we use the volume in a container?

giuseppe avatar Aug 17 '23 15:08 giuseppe

I have to assume we added that code for a reason, but I can't recall exactly why. Almost certainly a bugfix, but exactly what was being fixed is unclear. The exact on-mount behavior for volumes versus Docker has been a persistent problem.

mheon avatar Aug 17 '23 15:08 mheon

fyi in the past even the very first mounting container had bad ownership, see https://github.com/containers/podman/pull/10905

matejvasek avatar Aug 17 '23 16:08 matejvasek

I actually cannot find an explicit chown of the volume mountpoint anywhere in the mount code. So I'm not 100% sure where this is being done; it may be an unintentional side-effect of another chown doing something else?

mheon avatar Aug 17 '23 16:08 mheon

Looks like the chown is called only when the volume is brand new -- created together with a new container.

matejvasek avatar Aug 17 '23 16:08 matejvasek

https://github.com/containers/podman/blob/5ea019419cd78457230cf4d15ee459bf4288a1bd/libpod/container_internal_common.go#L2873-L2881

matejvasek avatar Aug 17 '23 16:08 matejvasek

wrt:

docker run --rm -v test-volume:/workspace alpine-usera sh -c 'echo done'
docker run --rm -v test-volume:/workspace alpine-userb sh -c 'touch /workspace/b'

It appears that chown is called only for the first container.

matejvasek avatar Aug 17 '23 16:08 matejvasek

There is some state, vol.state.NeedsChown; I assume this ensures that the chown is done only once?

matejvasek avatar Aug 17 '23 16:08 matejvasek

The vol.state.NeedsChown flag seems to be cleared by the first chown, done for the first container, so subsequent containers won't chown the volume.

matejvasek avatar Aug 17 '23 16:08 matejvasek

@giuseppe how important is vol.state.NeedsChown?

matejvasek avatar Aug 17 '23 16:08 matejvasek

https://github.com/containers/podman/blob/5ea019419cd78457230cf4d15ee459bf4288a1bd/libpod/container_internal_common.go#L2837-L2838

matejvasek avatar Aug 17 '23 16:08 matejvasek

Ah, ok, per @matejvasek it's fixVolumePermissions()

Looking further, it's tied to a bool, NeedsChown, in the volume config. Set to true at volume create, false once permissions have been fixed during first mount into a container. Dropping the bool entirely and making the chown unconditional ought to fix this?

mheon avatar Aug 17 '23 16:08 mheon
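
As a rough illustration of the NeedsChown flag described in the comment above, the chown-once behaviour and the proposed unconditional variant can be modelled in a few lines of shell. This is a toy model only: `needs_chown`, `volume_owner`, `mount_volume`, and the `always` argument are hypothetical names made up for this sketch, not podman code or podman options.

```shell
#!/bin/sh
# Toy model only; none of this is podman code.
needs_chown=1     # set when the volume is created
volume_owner=""   # uid:gid currently recorded for the volume root

# mount_volume OWNER [always]
# OWNER is the uid:gid the mounting container expects for the mount point.
# Passing "always" models dropping the flag check entirely.
mount_volume() {
    if [ "$needs_chown" -eq 1 ] || [ "${2:-}" = "always" ]; then
        volume_owner=$1
        needs_chown=0
    fi
}

mount_volume 1001:1002           # first container: the chown happens
mount_volume 1003:1004           # second container: skipped, flag already cleared
echo "chown once:    $volume_owner"

mount_volume 1003:1004 always    # same mount with the flag check dropped
echo "unconditional: $volume_owner"
```

In this model the first `echo` reports 1001:1002, which matches the failure in the report, and the second reports 1003:1004. The model says nothing about side effects of chowning a volume that already contains data.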

@mheon I believe it will fix the issue, but I don't know if it could have any adverse effects.

matejvasek avatar Aug 17 '23 16:08 matejvasek

It is not doing a recursive chown, correct? I think the goal there was to make sure the volume is owned by the primary user of the container. I think I had a PR on this code at one point to attempt to change it, but I gave up. https://github.com/containers/podman/pull/16782

rhatdan avatar Aug 17 '23 17:08 rhatdan

> make sure the volume is owned by the primary user of the container.

Small correction: the primary user's uid/gid is used only if the mount point does not already exist in the container. If the mount point exists (as a directory), then the uid/gid of that directory is used.

matejvasek avatar Aug 17 '23 17:08 matejvasek
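
The rule stated in the correction above can be written down as a small decision function. This is a sketch of the rule as described in this thread, not code taken from podman or Docker; `expected_owner` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper illustrating the rule; not taken from podman or Docker.
# expected_owner IMAGE_DIR_OWNER CONTAINER_USER
# IMAGE_DIR_OWNER is the uid:gid of the mount point directory in the image,
# or an empty string if the image has no such directory.
# CONTAINER_USER is the uid:gid of the container's primary user.
expected_owner() {
    if [ -n "$1" ]; then
        echo "$1"     # directory exists in the image: keep its owner
    else
        echo "$2"     # no directory: fall back to the primary user
    fi
}

expected_owner 0:0 1001:1002     # directory present and owned by root
expected_owner "" 1001:1002      # directory absent from the image
```

The first call shows the case where the two differ: a root-owned directory in the image stays root-owned even though the container runs as 1001:1002.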

Setting ownership just once makes sense if you assume that the volume will always be used by a single container. However, that's not my case: the pack CLI runs multiple containers in sequence on one volume.

matejvasek avatar Aug 17 '23 17:08 matejvasek

If there's a reason we added this originally, it's Docker compat. If Docker doesn't do the same thing, then that reason is not valid.

mheon avatar Aug 17 '23 18:08 mheon

> If there's a reason we added this originally, it's Docker compat. If Docker doesn't do the same thing, then that reason is not valid.

What do you mean by "this" here? The fact that we do the chown, or the fact that we do it only once?

matejvasek avatar Aug 17 '23 18:08 matejvasek

Only once. There's no reason we'd add such complexity other than to match Docker.

mheon avatar Aug 17 '23 19:08 mheon

Looks like NeedsChown was introduced in https://github.com/containers/podman/pull/6747 which was fixing https://github.com/containers/podman/issues/5698.

matejvasek avatar Aug 18 '23 16:08 matejvasek

The issue does not directly mention Docker.

matejvasek avatar Aug 18 '23 16:08 matejvasek

Also, --userns, which was what that change was supposed to fix, does not even exist on Docker. So I believe that NeedsChown has nothing to do with Docker compatibility.

matejvasek avatar Aug 19 '23 00:08 matejvasek