
Cannot mount VM shared folders inside docker system container

Open Lucky1313 opened this issue 2 years ago • 4 comments

I guess I'm not 100% sure this is actually a supported use case, but I couldn't find evidence either way.

My usage for sysbox is to be able to run a VM (using vagrant) which can then use sysbox to run system containers, that can then be used to test my application running inside the system container (that needs to use docker and systemd directly). Right now running on an Ubuntu machine, but the thinking with the VM is to be able to support using a MacOS host as well.

Everything seems to work, except for setting correct IDs for bind mounts (from VM filesystem to inner docker filesystem). Looking at the logs for sysbox-mgr, it seems like none of the methods for ID-mapping for mounts succeed:

Jul 11 20:09:09 ubuntu2204.localdomain systemd[1]: Starting sysbox-mgr (part of the Sysbox container runtime)...
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Starting ..."
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Sysbox data root: /var/lib/sysbox"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Shiftfs module found in kernel: yes"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Shiftfs works properly: no"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Shiftfs-on-overlayfs works properly: no"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="ID-mapped mounts supported by kernel: yes"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Overlayfs on ID-mapped mounts supported by kernel: no"
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Operating in system container mode."
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Inner container image preloading enabled."
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Listening on /run/sysbox/sysmgr.sock"
Jul 11 20:09:09 ubuntu2204.localdomain systemd[1]: Started sysbox-mgr (part of the Sysbox container runtime).
Jul 11 20:09:09 ubuntu2204.localdomain sysbox-mgr[9948]: time="2023-07-11 20:09:09" level=info msg="Ready ..."
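
To compare these capability probes between runs, the yes/no lines can be pulled out of the log. This is only a sketch: `sysbox_caps` is a hypothetical helper name, and it assumes the message format shown in the log above (`msg="<probe>: yes|no"`).

```shell
# Read sysbox-mgr log lines on stdin and print each capability probe
# as "<probe>: <yes|no>". Lines without a yes/no result are dropped.
sysbox_caps() {
  sed -nE 's/.*msg="(.*): (yes|no)".*/\1: \2/p'
}
```

Assuming the systemd unit is named `sysbox-mgr` as the log suggests, it could be fed with `journalctl -u sysbox-mgr --no-pager | sysbox_caps`.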

Tested using both virtualbox and libvirt (kvm) virtualization for vagrant.

Host Machine

Ubuntu 22.04 with 5.17 kernel

VM Machine

Ubuntu 22.04 with 5.15 kernel + ShiftFS installed
Sysbox version 0.6.1
Docker version 24.0.4

Recreate

# Using virtualbox
$ vagrant up
# Using libvirt
$ vagrant up --provider=libvirt
$ vagrant ssh
# Inside VM
$ docker run --runtime=sysbox-runc -it -v /home/vagrant:/root/ws ubuntu:focal /bin/bash
# Inside Container
$ ls -la /root/
# Observed output (note that ws is owned by nobody:nogroup)
total 24
drwx------ 1 root   root    4096 Jul 11 20:41 .
drwxr-xr-x 1 root   root    4096 Jul 11 20:41 ..
-rw-r--r-- 1 root   root    3106 Dec  5  2019 .bashrc
-rw-r--r-- 1 root   root     161 Dec  5  2019 .profile
drwxr-x--- 6 nobody nogroup 4096 Jul 11 20:11 ws

This happens when mounting folders from the VM disk into the docker container, as well as for synced VM folders (through either virtualbox or libvirt virtiofs).
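
A quick way to flag the symptom without reading `ls` output by eye is to check the numeric owner. The sketch below assumes the default Linux overflow ID of 65534, which is what the kernel reports (as nobody:nogroup) for an owner that has no mapping in the container's user namespace. `classify_owner` is a hypothetical helper name.

```shell
# Classify a "uid:gid" pair. 65534 is the default Linux overflow ID,
# shown as nobody:nogroup when the real owner is not mapped.
classify_owner() {
  case $1 in
    65534:*|*:65534) echo unmapped ;;
    *) echo mapped ;;
  esac
}
```

Inside the container it could be used as `classify_owner "$(stat -c '%u:%g' /root/ws)"`, assuming GNU `stat` is available in the image.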

Vagrantfile (libvirt)
$script = <<-SCRIPT
set -euxo pipefail
export DEBIAN_FRONTEND=noninteractive

echo "Installing docker"
apt-get update
apt-get install -y ca-certificates curl gnupg
mkdir -p /etc/apt/keyrings/
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --batch --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" > \
  /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
usermod -a -G docker vagrant

# Install shiftfs
apt-get install -y make dkms git wget
git clone -b k5.16 https://github.com/toby63/shiftfs-dkms.git shiftfs-k516
pushd shiftfs-k516
./update1
make -f Makefile.dkms
modinfo shiftfs
popd

echo "Installing sysbox"
# From https://github.com/nestybox/sysbox/blob/master/docs/user-guide/install-package.md
wget -q https://downloads.nestybox.com/sysbox/releases/v0.6.1/sysbox-ce_0.6.1-0.linux_amd64.deb
apt-get install -y jq
apt-get install -y ./sysbox-ce_0.6.1-0.linux_amd64.deb
rm sysbox-ce_0.6.1-0.linux_amd64.deb

echo "Done"
SCRIPT


Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu2204"

  config.vm.provision "shell", inline: $script

  config.vm.synced_folder "./", "/home/vagrant/ws/", type: "virtiofs"

  config.ssh.forward_agent = true
  
  config.vm.provider :libvirt do |libvirt|
    libvirt.cpus = 4
    libvirt.memory = 8192
    libvirt.memorybacking :access, :mode => "shared"
  end
end

Vagrantfile (virtualbox)
$script = <<-SCRIPT
set -euxo pipefail
export DEBIAN_FRONTEND=noninteractive

echo "Installing docker"
apt-get update
apt-get install -y ca-certificates curl gnupg
mkdir -p /etc/apt/keyrings/
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --batch --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" > \
  /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
usermod -a -G docker vagrant

# Install shiftfs
apt-get install -y make dkms git wget
git clone -b k5.16 https://github.com/toby63/shiftfs-dkms.git shiftfs-k516
pushd shiftfs-k516
./update1
make -f Makefile.dkms
modinfo shiftfs
popd

echo "Installing sysbox"
# From https://github.com/nestybox/sysbox/blob/master/docs/user-guide/install-package.md
wget -q https://downloads.nestybox.com/sysbox/releases/v0.6.1/sysbox-ce_0.6.1-0.linux_amd64.deb
apt-get install -y jq
apt-get install -y ./sysbox-ce_0.6.1-0.linux_amd64.deb
rm sysbox-ce_0.6.1-0.linux_amd64.deb

echo "Done"
SCRIPT


Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/jammy64"

  config.vm.provision "shell", inline: $script

  config.vm.synced_folder "./", "/home/vagrant/ws/"

  config.ssh.forward_agent = true

  config.vm.provider "virtualbox" do |v|
    v.memory = 4096
    v.cpus = 8
  end
end

Lucky1313 avatar Jul 11 '23 20:07 Lucky1313

@Lucky1313, thanks for reporting this one. Haven't looked in detail yet, but yes, this scenario (vagrant) is fully supported, that's actually what I use most of the time.

Before we spend any time on this, could you please try to install the latest sysbox release (v0.6.2) as there are important enhancements in this area?

rodnymolina avatar Jul 11 '23 21:07 rodnymolina

Sorry about having the wrong version, but I confirmed the issue is still present on v0.6.2. It does look like the detection of host capabilities is different:

Jul 11 22:33:16 ubuntu-jammy systemd[1]: Starting sysbox-mgr (part of the Sysbox container runtime)...
Jul 11 22:33:16 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:16" level=info msg="Starting ..."
Jul 11 22:33:16 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:16" level=info msg="Sysbox data root: /var/lib/sysbox"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Shiftfs module found in kernel: yes"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Shiftfs works properly: yes"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Shiftfs-on-overlayfs works properly: yes"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="ID-mapped mounts supported by kernel: yes"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Overlayfs on ID-mapped mounts supported by kernel: no"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Operating in system container mode."
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Inner container image preloading enabled."
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Listening on /run/sysbox/sysmgr.sock"
Jul 11 22:33:17 ubuntu-jammy sysbox-mgr[5359]: time="2023-07-11 22:33:17" level=info msg="Ready ..."
Jul 11 22:33:17 ubuntu-jammy systemd[1]: Started sysbox-mgr (part of the Sysbox container runtime).

But bind mounted folder still has nobody:nogroup IDs.

Sysbox versions:

Jul 11 22:33:17 ubuntu-jammy systemd[1]: Started Sysbox container runtime.
Jul 11 22:33:17 ubuntu-jammy sh[5382]: sysbox-runc
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         edition:         Community Edition (CE)
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         version:         0.6.2
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         commit:         60ca93c783b19c63581e34aa183421ce0b9b26b7
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         built at:         Mon Jun 12 03:49:19 UTC 2023
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         built by:         Cesar Talledo
Jul 11 22:33:17 ubuntu-jammy sh[5382]:         oci-specs:         1.0.2-dev
Jul 11 22:33:17 ubuntu-jammy sh[5389]: sysbox-mgr
Jul 11 22:33:17 ubuntu-jammy sh[5389]:         edition:         Community Edition (CE)
Jul 11 22:33:17 ubuntu-jammy sh[5389]:         version:         0.6.2
Jul 11 22:33:17 ubuntu-jammy sh[5389]:         commit:         4b5fb1def9abe6a256cfe62bacaf2a7d333d81d2
Jul 11 22:33:17 ubuntu-jammy sh[5389]:         built at:         Mon Jun 12 03:49:55 UTC 2023
Jul 11 22:33:17 ubuntu-jammy sh[5389]:         built by:         Cesar Talledo
Jul 11 22:33:17 ubuntu-jammy sh[5394]: sysbox-fs
Jul 11 22:33:17 ubuntu-jammy sh[5394]:         edition:         Community Edition (CE)
Jul 11 22:33:17 ubuntu-jammy sh[5394]:         version:         0.6.2
Jul 11 22:33:17 ubuntu-jammy sh[5394]:         commit:         30fd49edbd51048fed8b2ad0af327598d30b29eb
Jul 11 22:33:17 ubuntu-jammy sh[5394]:         built at:         Mon Jun 12 03:49:46 UTC 2023
Jul 11 22:33:17 ubuntu-jammy sh[5394]:         built by:         Cesar Talledo

Lucky1313 avatar Jul 11 '23 22:07 Lucky1313

@Lucky1313, sorry for the delay. The sysbox-mgr logs above indicate that shiftfs is properly being detected and no operational issues are being found during initialization.

For that reason, I suspect that the problem is not with your container's root file-system, on which shiftfs is probably working fine. The UID mismatch that you are observing is probably specific to the resources being bind-mounted. I'm not sure why, but it may be due to the fact that the underlying file-system is virtiofs (?)...

To confirm the above and help us narrow down the issue, please do the following:

  • Obtain a findmnt within the sysbox container where the UID mismatch issue is being observed.
  • Create a new sysbox container by bind-mounting a resource hosted in a file-system different than virtiofs; i.e., change this line in your Vagrantfile to avoid using: config.vm.synced_folder "./", "/home/vagrant/ws/", type: "virtiofs".
  • Finally, I would also suggest trying a VM with a more recent kernel (6.x+), as in that scenario Sysbox will switch to id-mapped mounts for UID shifting purposes, which are better supported these days (even though I'm not sure if virtiofs is supported there yet).
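
For the last suggestion, a small check on the VM can confirm whether the kernel meets the 6.x threshold mentioned above before re-testing. This is a sketch only: `kernel_major_at_least` is a hypothetical helper, and the threshold is taken from this comment rather than from Sysbox documentation.

```shell
# Succeed when the kernel release string $1 (e.g. from `uname -r`)
# has a major version of at least $2.
kernel_major_at_least() {
  major=${1%%.*}
  [ "$major" -ge "$2" ]
}
```

For example: `kernel_major_at_least "$(uname -r)" 6 && echo "kernel is 6.x or newer"`.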

rodnymolina avatar Jul 24 '23 04:07 rodnymolina

Verified that it is the synced folder that causes the issue, regardless of whether it is a virtualbox or virtiofs folder. Interestingly, by the look of it, just having a synced folder anywhere in the directory tree of the folder bind-mounted into docker causes the entire tree to have improperly mapped IDs (i.e. even though the synced folder in the VM is /home/vagrant/ws, all of /home/vagrant gets the bad IDs when it's mounted inside the docker container). Paths without any synced folders in them do get mounted correctly inside the docker container.
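
Given that observation, it may help to check whether a directory contains a synced-folder mount before bind-mounting it. The sketch below assumes the two filesystem types seen in this thread (`vboxsf` and `virtiofs`). `synced_mounts_under` is a hypothetical helper that filters "TARGET FSTYPE" pairs.

```shell
# Read "TARGET FSTYPE" pairs on stdin and print the targets located at or
# below directory $1 whose filesystem type is vboxsf or virtiofs.
synced_mounts_under() {
  dir=${1%/}                     # drop a trailing slash ("/" becomes "")
  while read -r target fstype; do
    case $fstype in
      vboxsf|virtiofs) ;;        # synced-folder types seen in this thread
      *) continue ;;
    esac
    case $target/ in
      "$dir"/*) printf '%s\n' "$target" ;;
    esac
  done
}
```

On the VM it could be run as `findmnt -rn -o TARGET,FSTYPE | synced_mounts_under /home/vagrant`. Any output would suggest that the directory is affected by the behavior described above.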

Tried upgrading kernel on both the virtualbox and libvirt providers, to both 5.19.0-50 and 6.2.0-26, no changes for any system.

So looks like synced folders in vagrant aren't supported at this time?

`findmnt` on Virtualbox
vagrant@ubuntu-jammy:~$ findmnt
TARGET                                        SOURCE                 FSTYPE      OPTIONS
/                                             /dev/sda1              ext4        rw,relatime,discard,errors=remount-ro
├─/sys                                        sysfs                  sysfs       rw,nosuid,nodev,noexec,relatime
│ ├─/sys/kernel/security                      securityfs             securityfs  rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/cgroup                            cgroup2                cgroup2     rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot
│ ├─/sys/fs/pstore                            pstore                 pstore      rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/bpf                               bpf                    bpf         rw,nosuid,nodev,noexec,relatime,mode=700
│ ├─/sys/kernel/debug                         debugfs                debugfs     rw,nosuid,nodev,noexec,relatime
│ ├─/sys/kernel/tracing                       tracefs                tracefs     rw,nosuid,nodev,noexec,relatime
│ ├─/sys/kernel/config                        configfs               configfs    rw,nosuid,nodev,noexec,relatime
│ └─/sys/fs/fuse/connections                  fusectl                fusectl     rw,nosuid,nodev,noexec,relatime
├─/proc                                       proc                   proc        rw,nosuid,nodev,noexec,relatime
│ └─/proc/sys/fs/binfmt_misc                  systemd-1              autofs      rw,relatime,fd=29,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=251
│   └─/proc/sys/fs/binfmt_misc                binfmt_misc            binfmt_misc rw,nosuid,nodev,noexec,relatime
├─/dev                                        udev                   devtmpfs    rw,nosuid,relatime,size=1980944k,nr_inodes=495236,mode=755,inode64
│ ├─/dev/pts                                  devpts                 devpts      rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000
│ ├─/dev/shm                                  tmpfs                  tmpfs       rw,nosuid,nodev,inode64
│ ├─/dev/hugepages                            hugetlbfs              hugetlbfs   rw,relatime,pagesize=2M
│ └─/dev/mqueue                               mqueue                 mqueue      rw,nosuid,nodev,noexec,relatime
├─/run                                        tmpfs                  tmpfs       rw,nosuid,nodev,noexec,relatime,size=400392k,mode=755,inode64
│ ├─/run/lock                                 tmpfs                  tmpfs       rw,nosuid,nodev,noexec,relatime,size=5120k,inode64
│ ├─/run/credentials/systemd-sysusers.service ramfs                  ramfs       ro,nosuid,nodev,noexec,relatime,mode=700
│ ├─/run/user/1000                            tmpfs                  tmpfs       rw,nosuid,nodev,relatime,size=400388k,nr_inodes=100097,mode=700,uid=1000,gid=1000,inode64
│ └─/run/snapd/ns                             tmpfs[/snapd/ns]       tmpfs       rw,nosuid,nodev,noexec,relatime,size=400392k,mode=755,inode64
│   └─/run/snapd/ns/lxd.mnt                   nsfs[mnt:[4026532195]] nsfs        rw
├─/snap/core20/1891                           /dev/loop0             squashfs    ro,nodev,relatime,errors=continue
├─/snap/lxd/24322                             /dev/loop1             squashfs    ro,nodev,relatime,errors=continue
├─/snap/snapd/19361                           /dev/loop2             squashfs    ro,nodev,relatime,errors=continue
├─/vagrant                                    vagrant                vboxsf      rw,relatime
│ └─/vagrant                                  vagrant                vboxsf      rw,relatime
└─/home/vagrant/ws                            home_vagrant_ws_       vboxsf      rw,relatime
  └─/home/vagrant/ws                          home_vagrant_ws_       vboxsf      rw,relatime

`findmnt` on libvirt
TARGET                                        SOURCE                            FSTYPE      OPTIONS
/                                             /dev/mapper/ubuntu--vg-ubuntu--lv ext4        rw,relatime
├─/sys                                        sysfs                             sysfs       rw,nosuid,nodev,noexec,relatime
│ ├─/sys/kernel/security                      securityfs                        securityfs  rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/cgroup                            cgroup2                           cgroup2     rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot
│ ├─/sys/fs/pstore                            pstore                            pstore      rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/bpf                               bpf                               bpf         rw,nosuid,nodev,noexec,relatime,mode=700
│ ├─/sys/kernel/tracing                       tracefs                           tracefs     rw,nosuid,nodev,noexec,relatime
│ ├─/sys/kernel/debug                         debugfs                           debugfs     rw,nosuid,nodev,noexec,relatime
│ ├─/sys/fs/fuse/connections                  fusectl                           fusectl     rw,nosuid,nodev,noexec,relatime
│ └─/sys/kernel/config                        configfs                          configfs    rw,nosuid,nodev,noexec,relatime
├─/proc                                       proc                              proc        rw,nosuid,nodev,noexec,relatime
│ └─/proc/sys/fs/binfmt_misc                  systemd-1                         autofs      rw,relatime,fd=29,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=18851
│   └─/proc/sys/fs/binfmt_misc                binfmt_misc                       binfmt_misc rw,nosuid,nodev,noexec,relatime
├─/dev                                        udev                              devtmpfs    rw,nosuid,relatime,size=4011824k,nr_inodes=1002956,mode=755,inode64
│ ├─/dev/pts                                  devpts                            devpts      rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000
│ ├─/dev/shm                                  tmpfs                             tmpfs       rw,nosuid,nodev,inode64
│ ├─/dev/hugepages                            hugetlbfs                         hugetlbfs   rw,relatime,pagesize=2M
│ └─/dev/mqueue                               mqueue                            mqueue      rw,nosuid,nodev,noexec,relatime
├─/run                                        tmpfs                             tmpfs       rw,nosuid,nodev,noexec,relatime,size=814028k,mode=755,inode64
│ ├─/run/lock                                 tmpfs                             tmpfs       rw,nosuid,nodev,noexec,relatime,size=5120k,inode64
│ ├─/run/credentials/systemd-sysusers.service none                              ramfs       ro,nosuid,nodev,noexec,relatime,mode=700
│ ├─/run/snapd/ns                             tmpfs[/snapd/ns]                  tmpfs       rw,nosuid,nodev,noexec,relatime,size=814028k,mode=755,inode64
│ │ └─/run/snapd/ns/lxd.mnt                   nsfs[mnt:[4026532381]]            nsfs        rw
│ └─/run/user/1000                            tmpfs                             tmpfs       rw,nosuid,nodev,relatime,size=814028k,nr_inodes=203507,mode=700,uid=1000,gid=1000,inode64
├─/snap/lxd/24322                             /dev/loop0                        squashfs    ro,nodev,relatime,errors=continue
├─/snap/core20/1822                           /dev/loop1                        squashfs    ro,nodev,relatime,errors=continue
├─/snap/snapd/18357                           /dev/loop2                        squashfs    ro,nodev,relatime,errors=continue
├─/boot                                       /dev/vda2                         ext4        rw,relatime
└─/home/vagrant/ws                            d3fa989972cdf06a7ed8de28edaa950   virtiofs    rw,relatime

Lucky1313 avatar Aug 01 '23 15:08 Lucky1313