
Error when mounting glusterfs mountpoint in inner container

Open dmarteau opened this issue 3 years ago • 8 comments

First, thank you for bringing us Sysbox.

Description of the problem:

We are trying to test some glusterfs configuration using the following:

  • Running two containers (from debian:bullseye), clientA and clientB, on sysbox-runc, with Docker and the glusterfs client installed and a glusterfs mountpoint at /srv/platform.
  • Running two other containers as glusterfs servers (on runc, since sysbox-runc prevents using extended attributes).

So far so good: everything works as expected.

Now we run an inner container inside clientA (or clientB). This container has a bind-mounted volume on a glusterfs mountpoint directory. The container fails to start with the following error:

> docker run -it --rm -v /srv/platform:/platform debian:bullseye

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused:
process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/srv/platform" to rootfs at "/platform"
caused: change mount propagation through procfd: no such file or directory: unknown.

Note:

  1. Binding a glusterfs mountpoint mounted on the bare-metal host as a volume in a Docker container (using runc) works without problems.
  2. If we mount the parent directory of the glusterfs mountpoint instead, we can access the mountpoint and it works as expected (i.e., we can access the replicated data).

Expected result

Being able to bind a glusterfs mountpoint from a Sysbox container into an inner container.

System configuration:

host: Ubuntu 18.04, kernel 5.0.4
container: debian:bullseye

sysbox-runc
    edition: Community Edition (CE)
    version: 0.4.1
    commit: d540126188a1e8595c8f769aeb91833002c37b3a
    built at: Fri Oct 1 19:33:49 UTC 2021
    built by: Rodny Molina
    oci-specs: 1.0.2-dev

docker/runc (host and container):
    Docker version 20.10.12, build e91ed57
    runc version 1.0.2
    commit: v1.0.2-0-g52b36a2
    spec: 1.0.2-dev
    go: go1.16.10
    libseccomp: 2.5.1

David.

dmarteau avatar Feb 23 '22 17:02 dmarteau

Hi @dmarteau, thanks for giving Sysbox a shot.

A few questions to help us debug:

Running two other containers as glusterfs servers (on runc, since sysbox-runc prevents using extended attributes).

Curious what problem you hit there?

docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/srv/platform" to rootfs at "/platform" caused: change mount propagation through procfd: no such file or directory: unknown.

That error comes from the inner runc, and Sysbox would have little (if anything) to do with it. Can you paste the output of findmnt inside the Sysbox container? I want to see the mount of /srv/platform in it.

If we mount the parent directory of the glusterfs mountpoint instead, we can access the mountpoint and it works as expected (i.e., we can access the replicated data).

That's interesting and will likely provide a clue on the underlying problem.

One additional experiment you could try is to run Docker-in-Docker without Sysbox (i.e., using privileged containers) and see if the problem reproduces.

ctalledo avatar Feb 23 '22 23:02 ctalledo

One more question: are you using shiftfs on this host? (You can check with lsmod | grep shiftfs.)

ctalledo avatar Feb 23 '22 23:02 ctalledo

One more question: are you using shiftfs on this host? (You can check with lsmod | grep shiftfs.)

It seems so:

> lsmod | grep shiftfs
shiftfs                28672  0
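
A loaded module does not by itself show which mounts use shiftfs. As a rough sketch (assuming a POSIX shell and awk; `mounts_of_type` is a hypothetical helper name, not a Sysbox or Docker command), the mount points of a given filesystem type can be listed from /proc/self/mountinfo, where the type is the first field after the "-" separator:

```sh
# List mount points whose filesystem type matches FSTYPE.
# Usage: mounts_of_type FSTYPE [MOUNTINFO_FILE]
# MOUNTINFO_FILE defaults to /proc/self/mountinfo (see proc(5)).
mounts_of_type() {
    awk -v fstype="$1" '
        {
            # Columns 1-6 are fixed; optional fields follow until "-".
            for (i = 7; i <= NF; i++) {
                if ($i == "-") {
                    # Column 5 is the mount point.
                    if ($(i + 1) == fstype) print $5
                    break
                }
            }
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

Run inside the Sysbox container, `mounts_of_type shiftfs` and `mounts_of_type fuse.glusterfs` should list the same targets that findmnt reports for those types.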

dmarteau avatar Feb 24 '22 19:02 dmarteau

Hi @dmarteau,

Can you paste the output of findmnt inside the Sysbox container?

Thanks!

ctalledo avatar Feb 24 '22 23:02 ctalledo

Here is the result of findmnt:

TARGET                                                       SOURCE                                                                                                       FSTYPE         OPTIONS
/                                                            .                                                                                                            shiftfs        rw,relatime
|-/sys                                                       sysfs                                                                                                        sysfs          rw,nosuid,nodev,noexec,relatime
| |-/sys/firmware                                            tmpfs                                                                                                        tmpfs          ro,relatime,uid=100000,gid=100000
| |-/sys/fs/cgroup                                           tmpfs                                                                                                        tmpfs          ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,uid=100000,gid=100000
| | |-/sys/fs/cgroup/systemd                                 systemd                                                                                                      cgroup         rw,nosuid,nodev,noexec,relatime,xattr,name=systemd
| | |-/sys/fs/cgroup/net_cls,net_prio                        cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
| | |-/sys/fs/cgroup/cpu,cpuacct                             cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
| | |-/sys/fs/cgroup/memory                                  cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,memory
| | |-/sys/fs/cgroup/hugetlb                                 cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,hugetlb
| | |-/sys/fs/cgroup/rdma                                    cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,rdma
| | |-/sys/fs/cgroup/blkio                                   cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,blkio
| | |-/sys/fs/cgroup/perf_event                              cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,perf_event
| | |-/sys/fs/cgroup/pids                                    cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,pids
| | |-/sys/fs/cgroup/freezer                                 cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,freezer
| | |-/sys/fs/cgroup/cpuset                                  cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,cpuset
| | `-/sys/fs/cgroup/devices                                 cgroup                                                                                                       cgroup         rw,nosuid,nodev,noexec,relatime,devices
| |-/sys/kernel/config                                       tmpfs                                                                                                        tmpfs          rw,nosuid,nodev,noexec,relatime,size=1024k,uid=100000,gid=100000
| |-/sys/kernel/debug                                        tmpfs                                                                                                        tmpfs          rw,nosuid,nodev,noexec,relatime,size=1024k,uid=100000,gid=100000
| |-/sys/kernel/tracing                                      tmpfs                                                                                                        tmpfs          rw,nosuid,nodev,noexec,relatime,size=1024k,uid=100000,gid=100000
| |-/sys/devices/virtual/dmi/id/product_uuid                 sysboxfs[/sys/devices/virtual/dmi/id/product_uuid]                                                           fuse           rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/sys/module/nf_conntrack/parameters/hashsize             sysboxfs[/sys/module/nf_conntrack/parameters/hashsize]                                                       fuse           rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/proc                                                      proc                                                                                                         proc           rw,nosuid,nodev,noexec,relatime
| |-/proc/bus                                                proc[/bus]                                                                                                   proc           ro,nosuid,nodev,noexec,relatime
| |-/proc/fs                                                 proc[/fs]                                                                                                    proc           ro,nosuid,nodev,noexec,relatime
| |-/proc/irq                                                proc[/irq]                                                                                                   proc           ro,nosuid,nodev,noexec,relatime
| |-/proc/sysrq-trigger                                      proc[/sysrq-trigger]                                                                                         proc           ro,nosuid,nodev,noexec,relatime
| |-/proc/asound                                             tmpfs                                                                                                        tmpfs          ro,relatime,uid=100000,gid=100000
| |-/proc/acpi                                               tmpfs                                                                                                        tmpfs          ro,relatime,uid=100000,gid=100000
| |-/proc/keys                                               udev[/null]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/proc/timer_list                                         udev[/null]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/proc/sched_debug                                        udev[/null]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/proc/scsi                                               tmpfs                                                                                                        tmpfs          ro,relatime,uid=100000,gid=100000
| |-/proc/swaps                                              sysboxfs[/proc/swaps]                                                                                        fuse           rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| |-/proc/sys                                                sysboxfs[/proc/sys]                                                                                          fuse           rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/proc/uptime                                             sysboxfs[/proc/uptime]                                                                                       fuse           rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/dev                                                       tmpfs                                                                                                        tmpfs          rw,nosuid,size=65536k,mode=755,uid=100000,gid=100000
| |-/dev/mqueue                                              mqueue                                                                                                       mqueue         rw,nosuid,nodev,noexec,relatime
| |-/dev/pts                                                 devpts                                                                                                       devpts         rw,nosuid,noexec,relatime,gid=100005,mode=620,ptmxmode=666
| |-/dev/shm                                                 shm                                                                                                          tmpfs          rw,nosuid,nodev,noexec,relatime,size=65536k,uid=100000,gid=100000
| |-/dev/null                                                udev[/null]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/kmsg                                                udev[/null]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/random                                              udev[/random]                                                                                                devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/full                                                udev[/full]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/tty                                                 udev[/tty]                                                                                                   devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/zero                                                udev[/zero]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| |-/dev/urandom                                             udev[/urandom]                                                                                               devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
| `-/dev/fuse                                                udev[/fuse]                                                                                                  devtmpfs       rw,nosuid,relatime,size=16251784k,nr_inodes=4062946,mode=755
|-/run                                                       tmpfs                                                                                                        tmpfs          rw,nosuid,nodev,relatime,size=65536k,mode=755,uid=100000,gid=100000
| |-/run/lock                                                tmpfs                                                                                                        tmpfs          rw,nosuid,nodev,noexec,relatime,size=4096k,uid=100000,gid=100000
| |-/run/docker/netns/ingress_sbox                           nsfs[net:[4026534038]]                                                                                       nsfs           rw
| `-/run/docker/netns/1-n4aj9udyb6                           nsfs[net:[4026534841]]                                                                                       nsfs           rw
|-/srv                                                       /var/lib/sysbox/shiftfs/603c8592-6292-4866-b8f0-d37c363ebfa6                                                 shiftfs        rw,relatime
| |-/srv/platform                                            gluster1:/platform                                                                                           fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072
| `-/srv/lizmap/accounts                                     gluster1:/mutu                                                                                               fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072
|-/etc/resolv.conf                                           /var/lib/sysbox/shiftfs/0aaa571f-6d36-4fd2-a45b-3656ff240fff[/resolv.conf]                                   shiftfs        rw,relatime
|-/etc/hostname                                              /var/lib/sysbox/shiftfs/0aaa571f-6d36-4fd2-a45b-3656ff240fff[/hostname]                                      shiftfs        rw,relatime
|-/etc/hosts                                                 /var/lib/sysbox/shiftfs/0aaa571f-6d36-4fd2-a45b-3656ff240fff[/hosts]                                         shiftfs        rw,relatime
|-/var/lib/docker                                            /dev/nvme0n1p3[/var/lib/sysbox/docker/f4e2ddcd4cb02c7f512579afb9342ba4b6e976f6f7f889b1be15f9a2555d07e6]      ext4           rw,relatime,errors=remount-ro
|-/var/lib/kubelet                                           /dev/nvme0n1p3[/var/lib/sysbox/kubelet/f4e2ddcd4cb02c7f512579afb9342ba4b6e976f6f7f889b1be15f9a2555d07e6]     ext4           rw,relatime,errors=remount-ro
|-/var/lib/rancher/k3s                                       /dev/nvme0n1p3[/var/lib/sysbox/rancher-k3s/f4e2ddcd4cb02c7f512579afb9342ba4b6e976f6f7f889b1be15f9a2555d07e6] ext4           rw,relatime,errors=remount-ro
|-/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs /dev/nvme0n1p3[/var/lib/sysbox/containerd/f4e2ddcd4cb02c7f512579afb9342ba4b6e976f6f7f889b1be15f9a2555d07e6]  ext4           rw,relatime,errors=remount-ro
|-/usr/src/linux-headers-5.4.0-100-generic                   /var/lib/sysbox/shiftfs/65dedd2a-099b-4343-ac6b-9da0358f873b                                                 shiftfs        ro,relatime
|-/usr/src/linux-hwe-5.4-headers-5.4.0-100                   /var/lib/sysbox/shiftfs/12aeb3da-37f4-4125-b394-9166ba93214b                                                 shiftfs        ro,relatime
`-/lib/modules/5.4.0-100-generic                             /var/lib/sysbox/shiftfs/b1a81008-6085-4d6f-946f-583d978eb77d                                                 shiftfs        ro,relatime

You can see the two glusterfs mounts:

| |-/srv/platform                                            gluster1:/platform                                                                                           fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072
| `-/srv/lizmap/accounts                                     gluster1:/mutu                                                                                               fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072

From the mount command:

gluster1:/platform on /srv/platform type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
gluster1:/mutu on /srv/lizmap/accounts type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

dmarteau avatar Feb 25 '22 14:02 dmarteau

Running two other containers as glusterfs servers (on runc, since sysbox-runc prevents using extended attributes).

Curious what problem you hit there?

We run the GlusterFS servers inside containers. Since glusterfs volumes require setting extended attributes, the containers must run as privileged. If we try to run these containers as inner containers in a Sysbox container, we hit a "Setting extended attributes failed, reason: Operation not permitted" error.

dmarteau avatar Feb 25 '22 17:02 dmarteau

Thanks @dmarteau.

You can see the two glusterfs mounts:

| |-/srv/platform                                            gluster1:/platform                                                                                           fuse.glusterfs rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072

That looks fine; nothing there would explain the error the inner Docker is reporting (mounting "/srv/platform" to rootfs at "/platform" caused: change mount propagation through procfd: no such file or directory: unknown).

I think debugging this will require digging into the OCI runc (i.e., the inner Docker's container runtime) to see why it's failing. I doubt the failure is Sysbox related given that the mount of /srv/platform looks fine.
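
Since the inner runc fails while changing mount propagation on the bind mount, one possible starting point is to check how /srv/platform and its parent are propagated inside the Sysbox container. The following is a minimal sketch, assuming a POSIX shell and awk are available there; `mount_propagation` is a hypothetical helper name, not part of runc or Sysbox. It reads /proc/self/mountinfo, where the optional fields (shared:N, master:N, unbindable) describe propagation and their absence means a private mount:

```sh
# Print the propagation type of a mount point.
# Usage: mount_propagation TARGET [MOUNTINFO_FILE]
# MOUNTINFO_FILE defaults to /proc/self/mountinfo (see proc(5)).
# Returns non-zero if TARGET is not a mount point.
mount_propagation() {
    awk -v target="$1" '
        $5 == target {
            flags = ""
            # Optional fields start at column 7 and end at the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)         flags = flags "shared "
                else if ($i ~ /^master:/)    flags = flags "slave "
                else if ($i == "unbindable") flags = flags "unbindable "
            }
            if (flags == "") flags = "private "
            # Keep the last match: it is the topmost mount on TARGET.
            found = flags
        }
        END {
            if (found == "") exit 1
            sub(/ $/, "", found)
            print found
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

For example, comparing `mount_propagation /srv/platform` with `mount_propagation /srv` might show whether the two mounts differ, which could be relevant given that binding the parent directory works. Recent util-linux versions expose the same information through `findmnt -o TARGET,PROPAGATION`.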

We run the GlusterFS servers inside containers. Since glusterfs volumes require setting extended attributes, the containers must run as privileged. If we try to run these containers as inner containers in a Sysbox container, we hit a "Setting extended attributes failed, reason: Operation not permitted" error.

That sounds like a Sysbox bug we just fixed in the upstream code (the fix will be in the upcoming v0.5.0 release). By the way, running privileged containers inside the Sysbox container is perfectly fine: they are privileged only within the Sysbox container, not at host level.

ctalledo avatar Feb 25 '22 17:02 ctalledo

Hi @dmarteau, curious if you've had a chance to retry this with the latest Sysbox release (v0.5.0)?

I expect the "Setting extended attributes failed, reason: Operation not permitted" error to be resolved, but I'm not sure about the other error you reported (i.e., the inner Docker hitting this error: mounting "/srv/platform" to rootfs at "/platform" caused: change mount propagation through procfd: no such file or directory: unknown).

ctalledo avatar Apr 15 '22 17:04 ctalledo