Error when attempting to use Docker with Sysbox on ZFS

Open nhoefer2 opened this issue 1 year ago • 4 comments

Using sysbox-ce_0.6.4-0.linux_amd64.deb

lsb_release -ar

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble

uname -a

Linux gil 6.8.0-45-generic #45-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 30 12:02:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Running the following command gives this error:

docker run --rm -d --runtime=sysbox-runc hello-world

96a24135a6eb0c92c892f3b39a2a1f7a955b775f3aba15211a2ae40485ddd5ab
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: container_linux.go:439: starting container process caused: process_linux.go:608: container init caused: process_linux.go:579: handleReqOp caused: rootfs_init_linux.go:325: chown overlayfs upper layet at %!s(MISSING) caused: failed to shift ACL for /main_pool/docker/overlay2/bb3273127fbba4cce77751c983fb08a2b077a7f55c398d21a3e208335c79580b/diff: failed to get ACL for /main_pool/docker/overlay2/bb3273127fbba4cce77751c983fb08a2b077a7f55c398d21a3e208335c79580b/diff: operation not supported: unknown.

I haven't the slightest idea on what's causing this or how to diagnose and resolve it. Any help would be greatly appreciated.

nhoefer2 avatar Oct 08 '24 19:10 nhoefer2

While my error is not explicitly mentioned in the troubleshooting guide, I managed to find this: https://github.com/nestybox/sysbox/blob/master/docs/user-guide/troubleshoot.md#sysbox-logs

Failed to Setup Docker Volume Manager Error

When creating a system container, Docker may report the following error:

docker run --runtime=sysbox-runc -it ubuntu:latest
docker: Error response from daemon: OCI runtime create failed: failed to setup docker volume manager: host dir for docker store /var/lib/sysbox/docker can't be on ..."

This means that Sysbox's /var/lib/sysbox directory is on a filesystem not supported by Sysbox.

This directory must be on one of the following filesystems:

ext4
btrfs

The same requirement applies to the /var/lib/docker directory.

This is normally the case for vanilla Ubuntu installations, so this error is not common.
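
To see which filesystem actually backs the directories named in the quoted guide, a check along these lines could help. This is only a sketch: it assumes `findmnt` from util-linux is installed, `fs_supported` is a hypothetical helper name, and its list simply mirrors the two filesystems quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the filesystem type is one of the two
# listed in the quoted troubleshooting guide (ext4, btrfs).
fs_supported() {
    case "$1" in
        ext4|btrfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the filesystem type behind each path given on the command line,
# e.g.: sh check-fs.sh /var/lib/docker /var/lib/sysbox
for dir in "$@"; do
    # -T resolves the mount that contains the path; skip paths it cannot resolve.
    fstype=$(findmnt -n -o FSTYPE -T "$dir") || continue
    if fs_supported "$fstype"; then
        echo "$dir: $fstype (listed in the guide)"
    else
        echo "$dir: $fstype (not in the guide's list)"
    fi
done
```

If the Docker data directory has been moved, as in this report, `docker info --format '{{.DockerRootDir}}'` prints the path to pass in instead of /var/lib/docker.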

I should mention that my Docker data directory is in a folder on a ZFS pool.

I set up my system so that all data lives on the ZFS pool, which is completely isolated from the OS disk (which has VERY limited capacity). Am I shit out of luck if I'm trying to use ZFS? Will this ever be supported in the future?

nhoefer2 avatar Oct 08 '24 20:10 nhoefer2

After countless hours of banging my head against the wall, I finally figured it out. Sysbox requires POSIX ACLs on the filesystem, which can be enabled on ZFS using the following commands:

zfs set acltype=posixacl poolname/datasetname
zfs set xattr=sa poolname/datasetname
umount /mountpoint
zfs mount poolname
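
To confirm that the properties took effect after remounting, the values can be read back with `zfs get`. The sketch below is an illustration, not part of the original fix: `acl_settings_ok` is a hypothetical helper, and it accepts both `posix` and `posixacl` on the assumption that newer OpenZFS releases print the shorter name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given acltype and xattr values match
# the settings from the fix above.
acl_settings_ok() {
    case "$1" in
        posix|posixacl) ;;   # assumption: newer OpenZFS prints "posix"
        *) return 1 ;;
    esac
    [ "$2" = "sa" ]
}

# Only query ZFS when a dataset name is passed, e.g.:
#   sh check-acl.sh poolname/datasetname
if [ -n "${1:-}" ] && command -v zfs >/dev/null 2>&1; then
    # -H drops the header line, -o value prints only the property value.
    acltype=$(zfs get -H -o value acltype "$1")
    xattr=$(zfs get -H -o value xattr "$1")
    if acl_settings_ok "$acltype" "$xattr"; then
        echo "$1: acltype=$acltype xattr=$xattr (matches the fix)"
    else
        echo "$1: acltype=$acltype xattr=$xattr (does not match the fix)"
    fi
fi
```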

May I suggest that the developers mention this requirement more explicitly somewhere in the configuration or troubleshooting guides?

nhoefer2 avatar Oct 08 '24 21:10 nhoefer2

Thanks @nhoefer2 for trying Sysbox and figuring out the problem with running Docker + Sysbox on ZFS (i.e., POSIX ACLs need to be enabled).

I am actually surprised POSIX ACLs are not enabled by default on ZFS, given that it's the standard.

Let's keep this issue open in case someone else bumps into the same problem. I've renamed the title based on your findings.

Thanks again!

ctalledo avatar Oct 11 '24 01:10 ctalledo

Thanks @nhoefer2, you solved a problem that had been bugging me for a while. I've tested it on Ubuntu 24.04 with kernel version 6.8 and on Debian bookworm with kernel version 6.10, and everything works perfectly. I installed OpenZFS 2.2, and the command to create the zpool is as follows:

zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -o compatibility=openzfs-2.2-linux \
    -O acltype=posixacl \
    -O xattr=sa \
    -O dnodesize=auto \
    -O compression=lz4 \
    -O normalization=formD \
    -O relatime=on \
    poolname /dev/sdx

ondh avatar Oct 24 '24 06:10 ondh

Hi folks, this should fix it (meaning Sysbox will work regardless of whether the underlying filesystem supports ACLs or not): https://github.com/nestybox/sysbox-libs/pull/56

ctalledo avatar Nov 04 '24 20:11 ctalledo

PR merged; fix will be present in Sysbox v0.6.5.
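
For anyone checking whether an installed Sysbox already includes the fix, a version comparison along these lines might be useful. It is a sketch under assumptions: `version_ge` is a hypothetical helper that relies on `sort -V`, and the package name `sysbox-ce` is inferred from the .deb filename mentioned at the top of this issue.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is greater than or equal to $2.
# Relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# On Debian/Ubuntu, read the installed version of the sysbox-ce package
# (package name assumed from the .deb filename) and compare it to 0.6.5.
if command -v dpkg-query >/dev/null 2>&1 &&
   installed=$(dpkg-query -W -f='${Version}' sysbox-ce 2>/dev/null) &&
   [ -n "$installed" ]; then
    if version_ge "$installed" "0.6.5"; then
        echo "sysbox-ce $installed: should include the ACL fix"
    else
        echo "sysbox-ce $installed: older than 0.6.5, ZFS needs POSIX ACLs enabled"
    fi
fi
```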

Closing.

ctalledo avatar Nov 04 '24 23:11 ctalledo