sysbox
Unable to Start Docker Daemon in Sysbox Container with NFS Mount
Hi, the Docker daemon fails to start within a Sysbox container when the Docker directory is mounted from an NFS server. The issue appears to be related to permissions on the NFS-mounted directory.
On NFS Server
cat /etc/exports
/var/cs/users 10.0.105.0/24(rw,no_subtree_check,all_squash,anonuid=166537,anongid=165536)
Here, 166537 is 165536 + 1001 (1001 is the UID of the user inside the Docker container).
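The arithmetic behind the anonuid/anongid values can be sketched as follows. This is only an illustration: the base of 165536 is taken from the ownership listings in this report, and the function name is hypothetical.

```python
# Assumption: the container's user namespace maps container ID 0 to host
# ID 165536, as the ownership listings in this report suggest.
SUBID_BASE = 165536


def to_host_id(container_id: int, base: int = SUBID_BASE) -> int:
    """Translate an ID seen inside the container to the ID seen on the host."""
    if container_id < 0:
        raise ValueError("IDs must be non-negative")
    return base + container_id


print(to_host_id(1001))  # 166537, the value used for anonuid
print(to_host_id(0))     # 165536, the value used for anongid
```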
ls -lah /var/cs/users/952/u-9524/.ide
total 12K
drwxr-xr-x 5 166537 165536 113 Jun 9 02:05 .
drwxr-xr-x 22 166537 165536 4.0K Jun 13 14:21 ..
drwx--x--- 12 166537 165536 214 Jun 12 18:58 docker
-rw-r--r-- 1 166537 165536 35 Jun 9 02:05 info.json
sudo ls -lah /var/cs/users/952/u-9524/.ide/docker
total 4.0K
drwx--x--- 12 166537 165536 214 Jun 12 18:58 .
drwxr-xr-x 5 166537 165536 113 Jun 9 02:05 ..
drwx--x--x 4 166537 165536 170 Jun 9 02:05 buildkit
drwx--x--- 2 166537 165536 10 Jun 12 18:58 containers
-rw------- 1 166537 165536 36 Jun 9 02:05 engine-id
drwx------ 3 166537 165536 25 Jun 9 02:05 image
drwxr-x--- 3 166537 165536 27 Jun 9 02:05 network
drwx------ 4 166537 165536 44 Jun 9 02:05 plugins
drwx------ 2 166537 165536 10 Jun 12 18:58 runtimes
drwx------ 2 166537 165536 10 Jun 9 02:05 swarm
drwx------ 2 166537 165536 10 Jun 12 18:58 tmp
drwx--x--- 3 166537 165536 25 Jun 9 02:06 vfs
drwx-----x 2 166537 165536 33 Jun 9 02:05 volumes
On Container's Host
cat /etc/docker/daemon.json
{
"userns-remap": "sysbox",
"runtimes": {
"sysbox-runc": {
"path": "/usr/bin/sysbox-runc"
}
},
"bip": "172.20.0.1/16",
"default-address-pools": [
{
"base": "172.25.0.0/16",
"size": 24
}
],
"insecure-registries": [
"10.0.200.37:5000"
]
}
sudo ls -lah /mnt/nfs/users/952/u-9524/.ide
total 12K
drwxr-xr-x 5 166537 165536 113 Jun 9 02:05 .
drwxr-xr-x 22 166537 165536 4.0K Jun 13 14:46 ..
drwx--x--- 12 166537 165536 214 Jun 12 18:58 docker
-rw-r--r-- 1 166537 165536 35 Jun 9 02:05 info.json
sudo ls -lah /mnt/nfs/users/952/u-9524/.ide/docker
total 4.0K
drwx--x--- 12 166537 165536 214 Jun 12 18:58 .
drwxr-xr-x 5 166537 165536 113 Jun 9 02:05 ..
drwx--x--x 4 166537 165536 170 Jun 9 02:05 buildkit
drwx--x--- 2 166537 165536 10 Jun 12 18:58 containers
-rw------- 1 166537 165536 36 Jun 9 02:05 engine-id
drwx------ 3 166537 165536 25 Jun 9 02:05 image
drwxr-x--- 3 166537 165536 27 Jun 9 02:05 network
drwx------ 4 166537 165536 44 Jun 9 02:05 plugins
drwx------ 2 166537 165536 10 Jun 12 18:58 runtimes
drwx------ 2 166537 165536 10 Jun 9 02:05 swarm
drwx------ 2 166537 165536 10 Jun 12 18:58 tmp
drwx--x--- 3 166537 165536 25 Jun 9 02:06 vfs
drwx-----x 2 166537 165536 33 Jun 9 02:05 volumes
Inside a container
ls -lah /home/user/.ide
total 16K
drwxr-xr-x 5 user root 113 Jun 9 02:05 .
drwxr-xr-x 22 user root 4.0K Jun 13 14:41 ..
drwx--x--- 12 user root 214 Jun 12 18:58 docker
-rw-r--r-- 1 user root 35 Jun 9 02:05 info.json
sudo ls -lah /home/user/.ide/docker
total 4.0K
drwx--x--- 12 user root 214 Jun 12 18:58 .
drwxr-xr-x 5 user root 113 Jun 9 02:05 ..
drwx--x--x 4 user root 170 Jun 9 02:05 buildkit
drwx--x--- 2 user root 10 Jun 12 18:58 containers
-rw------- 1 user root 36 Jun 9 02:05 engine-id
drwx------ 3 user root 25 Jun 9 02:05 image
drwxr-x--- 3 user root 27 Jun 9 02:05 network
drwx------ 4 user root 44 Jun 9 02:05 plugins
drwx------ 2 user root 10 Jun 12 18:58 runtimes
drwx------ 2 user root 10 Jun 9 02:05 swarm
drwx------ 2 user root 10 Jun 12 18:58 tmp
drwx--x--- 3 user root 25 Jun 9 02:06 vfs
drwx-----x 2 user root 33 Jun 9 02:05 volumes
sudo systemctl restart docker || journalctl -u docker
Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xeu docker.service" for details.
Jun 13 14:45:05 3c5adbffe1a5 systemd[1]: Starting Docker Application Container Engine...
Jun 13 14:45:05 3c5adbffe1a5 dockerd[104066]: time="2024-06-13T14:45:05.339870626Z" level=info msg="Starting up"
Jun 13 14:45:05 3c5adbffe1a5 dockerd[104066]: could not create or set daemon root permissions: /home/user/.ide/docker: chown /home/user/.ide/docker: operation not permitted
Jun 13 14:45:05 3c5adbffe1a5 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 13 14:45:05 3c5adbffe1a5 systemd[1]: docker.service: Failed with result 'exit-code'.
Jun 13 14:45:05 3c5adbffe1a5 systemd[1]: Failed to start Docker Application Container Engine.
Container created with mount
{
Source: '/mnt/nfs/users/952/u-9524',
Target: '/home/user',
Type: 'bind',
ReadOnly: false,
BindOptions: {
Propagation: 'rprivate'
}
}
Just to give you a little context, we're using an NFS share to store the Docker data from the container. This way, we can start our containers quickly and keep their data on shared storage.
I've also attempted to use separate NFS shares like so:
cat /etc/exports
/var/cs/home 10.0.105.0/24(rw,no_subtree_check,all_squash,anonuid=166537,anongid=165536) # user:root
/var/cs/docker 10.0.105.0/24(rw,no_subtree_check,all_squash,anonuid=165536,anongid=165536) # root:root
Next, I tried to mount them separately:
{
Source: '/mnt/nfs/home/952/u-9524',
Target: '/home/user',
Type: 'bind' as MountType,
ReadOnly: false,
BindOptions: {
Propagation: 'rprivate' as MountPropagation
}
},
{
Source: '/mnt/nfs/docker/952/u-9524',
Target: '/var/lib/docker',
Type: 'bind' as MountType,
ReadOnly: false,
BindOptions: {
Propagation: 'rprivate' as MountPropagation
}
}
However, this resulted in a container start failure with an error message:
(HTTP code 500) server error - failed to create task for container: failed to create shim task: OCI runtime create failed: error in the container spec: invalid mount config: failed to request mount source preps from sysbox-mgr: failed to invoke PrepMounts via grpc: rpc error: code = Unknown desc = failed to shift uids via chown for mount source at /mnt/nfs/docker/952/u-9524: failed to shift ACL for /mnt/nfs/docker/952/u-9524: failed to get ACL for /mnt/nfs/docker/952/u-9524: operation not supported: unknown
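A quick way to check whether a mount source exposes POSIX ACLs at all is to read the ACL extended attribute directly. This is a rough sketch, assuming a Linux host and assuming that the "failed to get ACL" step corresponds to reading `system.posix_acl_access`, which has not been confirmed against the Sysbox sources. As far as I know, NFSv4 mounts usually do not expose this attribute.

```python
import errno
import os


def posix_acl_status(path: str) -> str:
    """Report whether `path` exposes POSIX ACLs via extended attributes."""
    try:
        os.getxattr(path, "system.posix_acl_access")
        return "acl-present"
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            # The filesystem supports ACLs, but none is set on this path.
            return "supported-but-unset"
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            # Matches the "operation not supported" text in the error above.
            return "unsupported"
        # Any other failure is reported by its errno name.
        return f"unknown ({errno.errorcode.get(exc.errno, exc.errno)})"
```

Running `posix_acl_status("/mnt/nfs/docker/952/u-9524")` on the Sysbox host and comparing it with the result for a local ext4 or XFS directory would show whether the two behave differently.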
Then I decided to mount the share at a different directory, Target: '/var/lib/my-docker', and to point Docker's data root at it as well:
/etc/docker/daemon.json
{"data-root": "/var/lib/my-docker"}
The container started successfully and the shares seemed fine. Yet, when I tried to pull an image inside the container with docker pull ubuntu, I encountered this:
docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
00d679a470c4: Extracting [==================================================>] 28.87MB/28.87MB
failed to register layer: failed to Lchown "/etc/gshadow" for UID 0, GID 42: lchown /etc/gshadow: operation not permitted
So, as you can see, I'm pretty much stuck right now :-)
findmnt
TARGET SOURCE FSTYPE OPTIONS
/ overlay overlay rw,relatime,lowerdir=/var/lib/docker/165536.165536/overlay2/l/QIMHXFAT44O2GT5IYCFJTAXF3S:/var/lib/docker/165536.165536/overlay2/l/O5POPDBVEQFUKY7ZDZXGQTVN4Z:/var/lib/docker/165536.165536/overlay2/l/CDE7PAY527
|-/sys sysfs sysfs rw,nosuid,nodev,noexec,relatime
| |-/sys/firmware tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/sys/fs/cgroup tmpfs tmpfs ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,uid=165536,gid=165536,inode64
| | |-/sys/fs/cgroup/systemd systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,name=systemd
| | |-/sys/fs/cgroup/pids cgroup cgroup rw,nosuid,nodev,noexec,relatime,pids
| | |-/sys/fs/cgroup/perf_event cgroup cgroup rw,nosuid,nodev,noexec,relatime,perf_event
| | |-/sys/fs/cgroup/cpuset cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpuset
| | |-/sys/fs/cgroup/blkio cgroup cgroup rw,nosuid,nodev,noexec,relatime,blkio
| | |-/sys/fs/cgroup/devices cgroup cgroup rw,nosuid,nodev,noexec,relatime,devices
| | |-/sys/fs/cgroup/net_cls,net_prio cgroup cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
| | |-/sys/fs/cgroup/cpu,cpuacct cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
| | |-/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,freezer
| | |-/sys/fs/cgroup/memory cgroup cgroup rw,nosuid,nodev,noexec,relatime,memory
| | |-/sys/fs/cgroup/rdma cgroup cgroup rw,nosuid,nodev,noexec,relatime,rdma
| | |-/sys/fs/cgroup/misc cgroup cgroup rw,nosuid,nodev,noexec,relatime,misc
| | `-/sys/fs/cgroup/hugetlb cgroup cgroup rw,nosuid,nodev,noexec,relatime,hugetlb
| |-/sys/devices/virtual sysboxfs[/sys/devices/virtual] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| | `-/sys/devices/virtual/powercap tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/sys/kernel sysboxfs[/sys/kernel] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/sys/module/nf_conntrack/parameters
| sysboxfs[/sys/module/nf_conntrack/parameters] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/proc proc proc rw,nosuid,nodev,noexec,relatime
| |-/proc/bus proc[/bus] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/fs proc[/fs] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/irq proc[/irq] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/sysrq-trigger proc[/sysrq-trigger] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/acpi tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/proc/keys udev[/null] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/proc/timer_list udev[/null] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/proc/scsi tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/proc/swaps sysboxfs[/proc/swaps] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| |-/proc/sys sysboxfs[/proc/sys] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/proc/uptime sysboxfs[/proc/uptime] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/dev tmpfs tmpfs rw,nosuid,size=65536k,mode=755,uid=165536,gid=165536,inode64
| |-/dev/mqueue mqueue mqueue rw,nosuid,nodev,noexec,relatime
| |-/dev/pts devpts devpts rw,nosuid,noexec,relatime,gid=165541,mode=620,ptmxmode=666
| |-/dev/shm shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k,uid=165536,gid=165536,inode64
| |-/dev/null udev[/null] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/dev/random udev[/random] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/dev/kmsg udev[/null] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/dev/full udev[/full] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/dev/tty udev[/tty] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| |-/dev/zero udev[/zero] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
| `-/dev/urandom udev[/urandom] devtmpfs rw,nosuid,relatime,size=8145820k,nr_inodes=2036455,mode=755,inode64
|-/run tmpfs tmpfs rw,nosuid,nodev,relatime,size=65536k,mode=755,uid=165536,gid=165536,inode64
| |-/run/user/1001 tmpfs tmpfs rw,nosuid,nodev,relatime,size=419428k,nr_inodes=104857,mode=700,uid=166537,gid=165536,inode64
| `-/run/lock tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,size=4096k,uid=165536,gid=165536,inode64
|-/home/user 10.0.200.70:/var/cs/home/952/u-9524 nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.105.5,local_lock=none,addr=10.0.200.70
|-/cs-sockets /dev/sda1[/etc/nginx/sockets/u9524] ext4 rw,relatime,idmapped,discard,errors=remount-ro
|-/etc/resolv.conf /dev/sdc[/165536.165536/containers/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7/resolv.conf]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/etc/hostname /dev/sdc[/165536.165536/containers/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7/hostname]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/etc/hosts /dev/sdc[/165536.165536/containers/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7/hosts]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/var/lib/my-docker 10.0.200.70:/var/cs/docker/952/u-9524 nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.105.5,local_lock=none,addr=10.0.200.70
|-/var/lib/kubelet /dev/sda1[/var/lib/sysbox/kubelet/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/k0s /dev/sda1[/var/lib/sysbox/k0s/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/buildkit /dev/sda1[/var/lib/sysbox/buildkit/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
| /dev/sda1[/var/lib/sysbox/containerd/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/docker /dev/sda1[/var/lib/sysbox/docker/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/rancher/k3s /dev/sda1[/var/lib/sysbox/rancher-k3s/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/rancher/rke2 /dev/sda1[/var/lib/sysbox/rancher-rke2/72f3997457a639da4038fbce18d0de30707bc4c9ddc0cb066702174ce5a658f7]
| ext4 rw,relatime,discard,errors=remount-ro
|-/usr/src/linux-headers-5.15.0-112 /dev/sda1[/usr/src/linux-headers-5.15.0-112] ext4 ro,relatime,idmapped,discard,errors=remount-ro
|-/usr/src/linux-headers-5.15.0-112-generic
| /dev/sda1[/usr/src/linux-headers-5.15.0-112-generic] ext4 ro,relatime,idmapped,discard,errors=remount-ro
`-/usr/lib/modules/5.15.0-112-generic /dev/sda1[/usr/lib/modules/5.15.0-112-generic] ext4 ro,relatime,idmapped,discard,errors=remount-ro
I've tried once more to make things work, but unfortunately, I haven't been successful.
I realized that I had incorrectly configured NFS sharing; the idmapping for NFS wasn't functioning as it should, and I was using an older kernel that didn't support "Overlayfs on ID-mapped mounts". I've corrected all of these issues, but I still can't run the command sudo chown root /home/user/text.txt within the container. Oddly enough, I can execute any command from the Sysbox host (on the NFS client side).
Here are the configurations on my NFS Server:
cat /etc/exports
/var/cs/home 10.0.105.0/24(rw,no_subtree_check,no_root_squash)
cat /etc/idmapd.conf
[General]
Verbosity = 0
# set your own domain here, if it differs from FQDN minus hostname
# Domain = localdomain
Domain = lan
[Mapping]
Nobody-User = nobody
Nobody-Group = nogroup
In addition to the above, I ran this command: sudo echo N > /sys/module/nfsd/parameters/nfs4_disable_idmapping.
The command ls -lah /var/cs/home/952/u-9524/ produces the following:
-rw-rw-r-- 1 cs-user cs-root 0 Jun 22 15:13 text.txt
(For easier reading, I added the users cs-root and cs-user on the server with sudo useradd -u 165536 cs-root and sudo useradd -u 166537 cs-user.)
On the NFS client (Sysbox host) side, I have this setup:
The mount command is
sudo mount 10.0.200.70:/var/cs/home /mnt/nfs/home
The output of ls -lah /mnt/nfs/home/952/u-9524 on the host is
-rw-rw-r-- 1 cs-user 165536 0 Jun 22 15:13 text.txt
The command sudo chown cs-ubuntu /mnt/nfs/home/952/u-9524/text.txt works just fine on the host. But when I try to do the same inside a container, it fails:
docker run --rm --name tmp -it --runtime sysbox-runc -v /mnt/nfs/home/952/u-9524:/home/user 44c062a02c99 /bin/sh -c "ls -lah /home/user && chown root /home/user/text.txt"
The result is
-rw-rw-r-- 1 user root 0 Jun 22 15:13 /home/user/text.txt
chown: changing ownership of '/home/user/text.txt': Operation not permitted
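To compare the host and the container on equal terms, the same small probe can be run in both places. This is a minimal sketch, assuming Python 3 is available on both sides; the function name is hypothetical.

```python
import os
import tempfile


def can_chown(directory: str, uid: int, gid: int) -> bool:
    """Create a scratch file in `directory` and try to chown it to uid:gid."""
    fd, path = tempfile.mkstemp(dir=directory, prefix="chown-probe-")
    os.close(fd)
    try:
        os.chown(path, uid, gid)
        return True
    except PermissionError:
        # "Operation not permitted" (EPERM) ends up here.
        return False
    finally:
        # Always remove the scratch file, whether or not chown succeeded.
        os.unlink(path)
```

For example, `can_chown("/home/user", 0, 0)` run as root inside the container and `can_chown("/mnt/nfs/home/952/u-9524", 165536, 165536)` run as root on the host exercise the same ownership change from each side.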
idmapping seems to be working:
journalctl --no-pager -u sysbox-mgr
Jun 21 21:33:38 worker-5 systemd[1]: Starting sysbox-mgr (part of the Sysbox container runtime)...
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Starting ..."
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Sysbox data root: /var/lib/sysbox"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Shiftfs module found in kernel: no"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Shiftfs works properly: no"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Shiftfs-on-overlayfs works properly: no"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="ID-mapped mounts supported by kernel: yes"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Overlayfs on ID-mapped mounts supported by kernel: yes"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Operating in system container mode."
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Inner container image preloading enabled."
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Listening on /run/sysbox/sysmgr.sock"
Jun 21 21:33:38 worker-5 sysbox-mgr[1006]: time="2024-06-21 21:33:38" level=info msg="Ready ..."
Jun 21 21:33:38 worker-5 systemd[1]: Started sysbox-mgr (part of the Sysbox container runtime).
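The yes/no capability lines in this log can be pulled out with a few lines of Python, which makes it easier to compare hosts or kernel versions. This is a sketch that assumes the `msg="<name>: yes|no"` layout shown above; the sample is copied from that log.

```python
import re

# Matches only messages of the form msg="<name>: yes" or msg="<name>: no".
CAPABILITY_RE = re.compile(r'msg="(?P<name>[^":]+): (?P<value>yes|no)"')


def parse_capabilities(log_text: str) -> dict[str, bool]:
    """Map each capability message in a sysbox-mgr log to True or False."""
    caps = {}
    for match in CAPABILITY_RE.finditer(log_text):
        caps[match.group("name")] = match.group("value") == "yes"
    return caps


sample = '''
time="2024-06-21 21:33:38" level=info msg="Sysbox data root: /var/lib/sysbox"
time="2024-06-21 21:33:38" level=info msg="Shiftfs module found in kernel: no"
time="2024-06-21 21:33:38" level=info msg="Shiftfs works properly: no"
time="2024-06-21 21:33:38" level=info msg="ID-mapped mounts supported by kernel: yes"
time="2024-06-21 21:33:38" level=info msg="Overlayfs on ID-mapped mounts supported by kernel: yes"
'''

for name, supported in parse_capabilities(sample).items():
    print(f"{name}: {supported}")
```

Note that these lines describe what the kernel supports in general; they do not say whether ID-mapping was applied to a particular mount.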
cat /etc/idmapd.conf
[General]
Verbosity = 0
# set your own domain here, if it differs from FQDN minus hostname
# Domain = localdomain
Domain = lan
[Mapping]
Nobody-User = nobody
Nobody-Group = nogroup
Here's what my /etc/docker/daemon.json file looks like:
{
"userns-remap": "sysbox",
"runtimes": {
"sysbox-runc": {
"path": "/usr/bin/sysbox-runc"
}
},
"bip": "172.20.0.1/16",
"default-address-pools": [
{
"base": "172.25.0.0/16",
"size": 24
}
],
"insecure-registries": [
"10.0.200.37:5000"
]
}
findmnt
docker run --rm --name tmp -it --runtime sysbox-runc -v /mnt/nfs/home/952/u-9524:/home/user 44c062a02c99 findmnt
TARGET SOURCE FSTYPE OPTIONS
/ overlay overlay rw,relatime,lowerdir=/var/lib/docker/165536.165536/overlay2/l/ZJ3XHL4OSA67GW46RG7OWZIW63:/var/lib/docker/165536.165536/overlay2/l/HNIO4YVNNN34FFILPGJXBNEGXE:/var/lib/docker/165536.165536/overlay2/l/IV34VEYV47FU3UC7S6FETFPAE5:/var/lib
|-/sys sysfs sysfs rw,nosuid,nodev,noexec,relatime
| |-/sys/firmware tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/sys/fs/cgroup tmpfs tmpfs rw,nosuid,nodev,noexec,relatime,mode=755,uid=165536,gid=165536,inode64
| | |-/sys/fs/cgroup/systemd systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,name=systemd
| | |-/sys/fs/cgroup/perf_event cgroup cgroup rw,nosuid,nodev,noexec,relatime,perf_event
| | |-/sys/fs/cgroup/memory cgroup cgroup rw,nosuid,nodev,noexec,relatime,memory
| | |-/sys/fs/cgroup/blkio cgroup cgroup rw,nosuid,nodev,noexec,relatime,blkio
| | |-/sys/fs/cgroup/net_cls,net_prio cgroup cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
| | |-/sys/fs/cgroup/misc cgroup cgroup rw,nosuid,nodev,noexec,relatime,misc
| | |-/sys/fs/cgroup/cpuset cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpuset
| | |-/sys/fs/cgroup/freezer cgroup cgroup rw,nosuid,nodev,noexec,relatime,freezer
| | |-/sys/fs/cgroup/cpu,cpuacct cgroup cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
| | |-/sys/fs/cgroup/hugetlb cgroup cgroup rw,nosuid,nodev,noexec,relatime,hugetlb
| | |-/sys/fs/cgroup/pids cgroup cgroup rw,nosuid,nodev,noexec,relatime,pids
| | |-/sys/fs/cgroup/rdma cgroup cgroup rw,nosuid,nodev,noexec,relatime,rdma
| | `-/sys/fs/cgroup/devices cgroup cgroup rw,nosuid,nodev,noexec,relatime,devices
| |-/sys/devices/virtual sysboxfs[/sys/devices/virtual] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| |-/sys/kernel sysboxfs[/sys/kernel] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/sys/module/nf_conntrack/parameters sysboxfs[/sys/module/nf_conntrack/parameters] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/proc proc proc rw,nosuid,nodev,noexec,relatime
| |-/proc/bus proc[/bus] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/fs proc[/fs] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/irq proc[/irq] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/sysrq-trigger proc[/sysrq-trigger] proc ro,nosuid,nodev,noexec,relatime
| |-/proc/asound tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/proc/acpi tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/proc/keys udev[/null] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/proc/timer_list udev[/null] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/proc/scsi tmpfs tmpfs ro,relatime,uid=165536,gid=165536,inode64
| |-/proc/swaps sysboxfs[/proc/swaps] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| |-/proc/sys sysboxfs[/proc/sys] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
| `-/proc/uptime sysboxfs[/proc/uptime] fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other
|-/dev tmpfs tmpfs rw,nosuid,size=65536k,mode=755,uid=165536,gid=165536,inode64
| |-/dev/console devpts[/0] devpts rw,nosuid,noexec,relatime,gid=165541,mode=620,ptmxmode=666
| |-/dev/mqueue mqueue mqueue rw,nosuid,nodev,noexec,relatime
| |-/dev/pts devpts devpts rw,nosuid,noexec,relatime,gid=165541,mode=620,ptmxmode=666
| |-/dev/shm shm tmpfs rw,nosuid,nodev,noexec,relatime,size=65536k,uid=165536,gid=165536,inode64
| |-/dev/null udev[/null] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/dev/random udev[/random] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/dev/kmsg udev[/null] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/dev/full udev[/full] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/dev/tty udev[/tty] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| |-/dev/zero udev[/zero] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
| `-/dev/urandom udev[/urandom] devtmpfs rw,nosuid,relatime,size=8097384k,nr_inodes=2024346,mode=755,inode64
|-/home/user 10.0.200.70:/var/cs/home/952/u-9524 nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.105.5,local_lock=none,addr=10.0.200.70
| ext4 rw,relatime,discard,errors=remount-ro
|-/etc/resolv.conf /dev/sdc[/165536.165536/containers/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade/resolv.conf]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/etc/hostname /dev/sdc[/165536.165536/containers/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade/hostname]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/etc/hosts /dev/sdc[/165536.165536/containers/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade/hosts]
| xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,usrquota,prjquota,grpquota
|-/var/lib/kubelet /dev/sda1[/var/lib/sysbox/kubelet/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/k0s /dev/sda1[/var/lib/sysbox/k0s/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/buildkit /dev/sda1[/var/lib/sysbox/buildkit/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
| /dev/sda1[/var/lib/sysbox/containerd/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/rancher/k3s /dev/sda1[/var/lib/sysbox/rancher-k3s/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/var/lib/rancher/rke2 /dev/sda1[/var/lib/sysbox/rancher-rke2/10390c0e6738870f816f816328e1d81764c9d2ff0cb4113f1faf2f82546c6ade]
| ext4 rw,relatime,discard,errors=remount-ro
|-/usr/src/linux-headers-6.5.0-41-generic
| /dev/sda1[/usr/src/linux-headers-6.5.0-41-generic] ext4 ro,relatime,idmapped,discard,errors=remount-ro
|-/usr/src/linux-hwe-6.5-headers-6.5.0-41
| /dev/sda1[/usr/src/linux-hwe-6.5-headers-6.5.0-41] ext4 ro,relatime,idmapped,discard,errors=remount-ro
`-/usr/lib/modules/6.5.0-41-generic /dev/sda1[/usr/lib/modules/6.5.0-41-generic] ext4 ro,relatime,idmapped,discard,errors=remount-ro
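One detail in the findmnt output above is that the ext4 bind mounts under /usr carry the `idmapped` option while the nfs4 mount at /home/user does not. The sketch below lists that flag per mount. It assumes the four-column text layout shown above and skips rows that findmnt wrapped onto a second line; the sample rows are taken from that output, with the NFS options abbreviated.

```python
def parse_findmnt(text: str) -> list[dict]:
    """Extract target, fstype and the idmapped flag from findmnt tree output."""
    mounts = []
    for line in text.splitlines():
        # Drop the tree-drawing prefix ("|", "`", "-", spaces).
        fields = line.lstrip("|`- ").split()
        if len(fields) != 4 or not fields[0].startswith("/"):
            continue  # header row or a wrapped continuation line
        target, _source, fstype, options = fields
        mounts.append({
            "target": target,
            "fstype": fstype,
            "idmapped": "idmapped" in options.split(","),
        })
    return mounts


sample = """TARGET SOURCE FSTYPE OPTIONS
|-/home/user 10.0.200.70:/var/cs/home/952/u-9524 nfs4 rw,relatime,vers=4.2,sec=sys
`-/usr/lib/modules/6.5.0-41-generic /dev/sda1[/usr/lib/modules/6.5.0-41-generic] ext4 ro,relatime,idmapped,discard,errors=remount-ro
"""

for mount in parse_findmnt(sample):
    print(mount["target"], mount["fstype"], "idmapped" if mount["idmapped"] else "not idmapped")
```

A missing flag on the NFS mount may mean that no ID-mapping is applied there, but that is an inference from the listing, not something the output states directly.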
At this point, I'm close to giving up...
I'm not sure if this will be helpful but try taking a look at this: https://github.com/nestybox/sysbox/issues/849
Thanks @nhoefer2
That looks like it could really help. I’ll give it a try next week and let you know how it goes. By the way, I’m using the XFS file system, but I’m not sure if ACL was enabled or not.
Unfortunately, it does not help me. I decided to give it another shot at attaching an NFS volume to a Docker container. Below, I’ve listed all the steps I took to reproduce the issue (there weren’t that many).
Description:
When attempting to mount NFS volumes on a Docker container, the Docker daemon fails to start inside the container. This issue occurs on a setup involving two hosts running Ubuntu 24.04, with one acting as an NFS server and the other as an NFS client.
Environment:
- Hosts: 2 hosts with Ubuntu 24.04
- NFS Server: XFS file system on Host 1
- NFS Client: Mounts NFS share from Host 1 to Host 2
Steps to Reproduce:
- Configure NFS Server (Host 1):
  - Install NFS server: sudo apt -y install nfs-kernel-server nfs4-acl-tools
  - Configure /etc/idmapd.conf: Domain = ide.lan
  - Update /etc/exports: /data 10.0.105.0/24(rw,no_root_squash)
  - Apply changes: sudo systemctl restart nfs-server
- Configure NFS Client (Host 2):
  - Install NFS client:
    sudo apt -y install nfs-common nfs4-acl-tools
    sudo nano /etc/idmapd.conf # Domain = ide.lan
    sudo nano /etc/fstab # nfs-server.ide.lan:/data /mnt/nfs_share nfs defaults 0 0
  - Mount NFS share:
    sudo mkdir /mnt/nfs_share
    sudo mount -a
- Verify ACL (Access Control List):
  - On the NFS server: sudo setfacl -m g:root:rwx /data/docker
  - On the NFS client: getfacl /mnt/nfs_share/docker
- Run Docker Container:
  docker run --runtime sysbox-runc --name nfs-poc --rm -it -v /mnt/nfs_share/docker:/var/lib/docker nestybox/ubuntu-noble-systemd-docker:latest
- Observe Docker Service Failure in Logs:
  - Inside the container, the following error appears: chmod /var/lib/docker: operation not permitted
Error Logs:
Oct 13 19:43:54 f9d74a2ab68a systemd[1]: Failed to start docker.service - Docker Application Container Engine.
Oct 13 19:43:56 f9d74a2ab68a dockerd[1383]: chmod /var/lib/docker: operation not permitted
Additional Behavior:
When restarting the container multiple times, sysbox-mgr shows warnings that the NFS share is already mounted in another container:
systemd[1]: Starting sysbox-mgr.service - sysbox-mgr (part of the Sysbox container runtime)...
time="2024-10-13 19:42:45" level=info msg="Starting ..."
time="2024-10-13 19:42:45" level=info msg="Sysbox data root: /var/lib/sysbox"
time="2024-10-13 19:42:45" level=info msg="Shiftfs module found in kernel: no"
time="2024-10-13 19:42:45" level=info msg="Shiftfs works properly: no"
time="2024-10-13 19:42:45" level=info msg="Shiftfs-on-overlayfs works properly: no"
time="2024-10-13 19:42:45" level=info msg="ID-mapped mounts supported by kernel: yes"
time="2024-10-13 19:42:45" level=info msg="Overlayfs on ID-mapped mounts supported by kernel: yes"
time="2024-10-13 19:42:45" level=info msg="Operating in system container mode."
time="2024-10-13 19:42:45" level=info msg="Inner container image preloading enabled."
time="2024-10-13 19:42:45" level=info msg="Listening on /run/sysbox/sysmgr.sock"
time="2024-10-13 19:42:45" level=info msg="Ready ..."
sysbox-mgr[939]: mount source at /mnt/nfs_share/docker should be mounted in one container only, but is already mounted in containers [f9d74a2ab68a...]
Expected Behavior:
Docker should start successfully inside the container, and NFS shares should be correctly mounted without permission issues or overlapping mounts.
Actual Behavior:
The Docker daemon fails to start due to permission issues on the NFS-mounted directory. Additionally, sysbox-mgr reports that the same NFS mount source is being reused across multiple containers, leading to errors.
Hi @bushev, thanks for trying Sysbox, hope you find it useful.
I suspect the problem you are having is that Sysbox uses shiftfs or ID-mapped mounts on host directories mounted into the container, and I don't believe either of these mechanisms works on top of NFS mounts (unfortunately).
For example, when you do
docker run --runtime sysbox-runc --name nfs-poc --rm -it -v /mnt/nfs_share/docker:/var/lib/docker nestybox/ubuntu-noble-systemd-docker:latest
how does ls -l /var/lib/docker look from inside the container?
Hey Cesar, thanks for looking into that!
I just tried what you suggested, and strangely enough, the previous error seems to have disappeared. I can now confirm that Docker is starting within the container. This might be related to the fact that I rebooted the servers several times and enabled ACL with different parameters afterward. I can’t fully explain why, but it started working, and it seems to be functional for now.
However, when I attempted to pull an image, for example, for MySQL, I encountered an error at the end stating that it couldn’t create a symbolic link. I believe this might be due to a limitation related to NFS and how it’s mounted inside the Sysbox container, but this is clearly a separate issue. Hopefully, this will be the last problem preventing full NFS compatibility.
Hi @bushev, that's progress, thanks.
However, I don't know what could be causing the latest error you see when the image gets pulled by Docker inside the Sysbox container. Does it occur with other images? For example, does docker run -it --rm alpine work?
Hmm, no this doesn’t work either, but the error is somehow related to a symlink as before.
user@8c397c02138d:~$ docker run -it --rm alpine
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
43c4264eed91: Extracting [==================================================>] 3.624MB/3.624MB
docker: failed to register layer: failed to Lchown "/etc/shadow" for UID 0, GID 42: lchown /etc/shadow: operation not permitted.
I suspect the issue you are facing is not so much related to Sysbox, as it is related to placing /var/lib/docker on an NFS mount. I am pretty sure that if you do the same without Sysbox (e.g., by simply configuring the Docker engine's data-root to an NFS backed directory), you'll see the same error.
Now, as to why it fails, I don't know. But it's probably due to limitations on NFS. Figuring that out would require a deeper investigation.
If I am incorrect and you believe the problem is specific to running the Docker engine on Sysbox, then we can dig further to see why that is. But I don't see any indication of this; the problem appears to be related to NFS more than anything else.
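For the comparison without Sysbox, one option is to derive a daemon.json for a plain Docker engine whose data-root points at the NFS-backed directory. This is a sketch only: `data-root` is the key already used earlier in this thread, the path is the one from the reproduction steps, and the helper name is hypothetical. Whether to keep other keys such as userns-remap for the test is left to whoever runs it.

```python
import json


def with_data_root(config: dict, data_root: str) -> dict:
    """Return a copy of a daemon.json dict with data-root set to `data_root`."""
    updated = dict(config)  # shallow copy so the input is left untouched
    updated["data-root"] = data_root
    return updated


# Assumption: only the registry setting is carried over from the host config.
base = {"insecure-registries": ["10.0.200.37:5000"]}
print(json.dumps(with_data_root(base, "/mnt/nfs_share/docker"), indent=2))
```

If docker pull then fails with the same lchown error on a host without Sysbox, that would point at the NFS export rather than at Sysbox.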