Unable to start container with sysbox runtime after kernel update.

Open netlore opened this issue 3 years ago • 28 comments

Running Ubuntu 22.04, I just received a kernel update from 5.15.0-47 to 5.15.0-48, matching the security advisory linked below. Containers can no longer be started with the sysbox-runc runtime:

https://ubuntu.com/security/notices/USN-5624-1

# docker run --runtime sysbox-runc -it nestybox/ubuntu-focal-docker:latest /bin/bash
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: process_linux.go:578: handleReqOp caused: rootfs_init_linux.go:366: failed to mkdirall /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher/rke2: mkdir /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher: value too large for defined data type caused: mkdir /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher: value too large for defined data type: unknown.

Rolled back to 5.15.0-47 and it seems to be working again.

Installed from "sysbox-ce_0.5.2-0.linux_amd64.deb" - Any thoughts would be appreciated.

netlore avatar Sep 22 '22 18:09 netlore

Hi @netlore , thanks for using Sysbox.

The error is almost certainly caused by an incompatibility between shiftfs and the kernel (nothing in Sysbox per se).

Just yesterday someone else reported this issue too: https://github.com/nestybox/sysbox/issues/595

There must be something in the 5.15.0-48 kernel that is causing the incompatibility with the shiftfs module. Speculating a bit, maybe the kernel is missing an Ubuntu patch required for overlayfs to work with shiftfs, or maybe the shiftfs module needs updating to work with this kernel.

We would need to dig deep into the commits of 5.15.0-48 to see what's going on.

If rolling back to the prior kernel is not an option for you, as a workaround you can try using a newer kernel (maybe 5.18?) or configuring Sysbox to not use shiftfs (it will instead use an alternative mechanism called ID-mapped-mounts in the kernel). To do the latter, modify the sysbox systemd service for the sysbox-mgr and pass the --disable-shiftfs flag to it.

Disabling shiftfs is not ideal, but things should still work without it.

ctalledo avatar Sep 23 '22 00:09 ctalledo

Oh wow, I've been looking into the situation with shiftfs, and it seems there could be some serious confusion going on. The upstream kernel is going for ID-mapped mounts, but they are not yet supported for ZFS or CephFS. shiftfs has never been officially upstreamed, but Canonical are carrying patches to include it in their kernels through 22.04. I'll need to review the diffs tomorrow, but I believe there were changes to shiftfs between -47 and -48, perhaps because of the 11 CVEs that -48 addressed.

Can you clarify why you favour shiftfs, as you said that disabling shiftfs (using ID-Mapped mounts) is not ideal... I'd like to understand what's not ideal about it (other than the current lack of support for ZFS/CephFS).

I can of course update you with whatever details I find regarding changes to shiftfs in -48 (if you're interested)... in the morning.

netlore avatar Sep 23 '22 01:09 netlore

I noticed in the changelog for Canonical's kernel that 5.15.0-48 includes a resync with upstream. I wonder if they lost their patch that allows shiftfs to work with overlayfs; I feel like that would break things in the above kind of way. Here's the patch for that:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846272

Will check to see if it's actually the case that this is missing, and/or if it was accidental or.....

netlore avatar Sep 23 '22 01:09 netlore

This seems to show a related check in overlayfs, being reverted to mainline, could this be the smoking gun?

diff -ur linux-5.15.0-47/fs/overlayfs/super.c linux-5.15.0-48/fs/overlayfs/super.c
--- linux-5.15.0-47/fs/overlayfs/super.c	2022-09-22 15:52:28.806462177 +0100
+++ linux-5.15.0-48/fs/overlayfs/super.c	2022-09-22 15:53:35.962351879 +0100
@@ -873,7 +873,7 @@
 		pr_err("filesystem on '%s' not supported\n", name);
 		goto out_put;
 	}
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		pr_err("idmapped layers are currently not supported\n");
 		goto out_put;
 	}

netlore avatar Sep 23 '22 16:09 netlore

Hi @netlore, apologies for the late reply. I don't think that's the culprit because it's related to ID-mapped-mounts rather than shiftfs itself.

Below is the list of patches to overlayfs that are required to make it work with shiftfs. I didn't check if the 5.15.0-48 kernel is missing any of these.

07648d68cea786d2ff599b51139013044ec59a8a   (05/16/22 - UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set())
b07bc17b8363190be1328fe162768f7fdcb8fcaa   (04/14/22 - UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files)
730264093da28294476d5c41b055a271facdd998   (10/02/19 - UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay)
3fb38c98e060b327cb58373775dcc95ed52d1f22   (01/19/16 - UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs)
796fe8290349ef4cd8719a68966893c1c1b5a677   (01/12/22 - UBUNTU: SAUCE: vfs: test that one given mount param is not larger than PAGE_SIZE)

ctalledo avatar Sep 28 '22 21:09 ctalledo

FYI: kernel 5.19 seems to work: https://github.com/nestybox/sysbox/issues/595#issuecomment-1255906485

ctalledo avatar Sep 28 '22 22:09 ctalledo

Downgrading to 5.15.0-47 fixed the issue for me as well

fuomag9 avatar Oct 02 '22 22:10 fuomag9

What are the consequences of temporarily disabling shiftfs? Do we need to recreate volumes or mounted directories? Will it affect file ownership?

Thanks for the help

drakes00 avatar Oct 03 '22 10:10 drakes00

Hi @drakes00,

What are the consequences of temporarily disabling shiftfs?

There should be minor negative consequences (see below), but there is no need to recreate volumes, mounted dirs, etc., and it won't affect file ownership on the host machine either.

The only thing is that without shiftfs your kernel must support ID-mapped-mounts, which works in a lot of cases but not all (it's improving fast though).

One area where ID-mapped-mounts did not work until recently is compatibility with overlayfs (which is important since Docker sets up the container's rootfs with overlayfs). Due to this incompatibility we added a workaround in Sysbox: if shiftfs is not present, Sysbox chowns the container's rootfs when the container starts. That's not ideal, but it works.

I believe kernel 5.19 added ID-mapped-mount support for overlayfs (need to double-check). If true, then we will adjust sysbox to use ID-mapped-mounts for the container's rootfs too, and at that point ID-mapped-mounts would essentially replace shiftfs for all practical purposes.

Hope that helps.

A bit more info on this in the sysbox user guide doc.

ctalledo avatar Oct 03 '22 15:10 ctalledo

Hey there - any update on this issue? Running into the same on a recently updated 22.04.1 LTS ubuntu system

selina@cirl-mrt-1:~$ uname -a
Linux cirl-mrt-1 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
docker run -d -p 1880:1880 --runtime=sysbox-runc -v /home/selina/node_red_data:/data --name mynodered localnodered
712be8c1aa6567b56b529bd48c1bd5a0cef2e3d1866d0c6c67b9ed885101c3ac
docker: Error response from daemon: OCI runtime create failed: container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: process_linux.go:578: handleReqOp caused: rootfs_init_linux.go:366: failed to mkdirall /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type caused: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type: unknown.

sfph avatar Oct 25 '22 20:10 sfph

Hi, kernel 6.0.0 solved the issue on my end. Cheers

% uname -a                                                                                                                            
Linux Ma1X-Os-X-n3zu 6.0.0-060000-generic #202210022231 SMP PREEMPT_DYNAMIC Sun Oct 2 22:35:09 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
% sysbox-runc -v                                                                                                                      
sysbox-runc
        edition:        Community Edition (CE)
        version:        0.5.2
        commit:         d91c42c2125fd7aaf46f66307eb5c2a025f30289
        built at:       Wed May 18 19:49:04 UTC 2022
        built by:       Rodny Molina
        oci-specs:      1.0.2-dev

drakes00 avatar Oct 25 '22 21:10 drakes00

Hi @sfph,

Hey there - any update on this issue? Running into the same on a recently updated 22.04.1 LTS ubuntu system

selina@cirl-mrt-1:~$ uname -a
Linux cirl-mrt-1 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Unfortunately there isn't much we can do with Ubuntu kernels 5.15.0-48 and later, as they are apparently missing an Ubuntu patch to overlayfs, and without it the interaction with shiftfs breaks.

If you can, please upgrade to a newer kernel (e.g., 5.19, 6.0, etc.).

If you must use kernel 5.15, try using 5.15.0-47 or earlier.

If you must use kernel 5.15.0-48 or later, you can work around the problem by either:

  1. Removing the shiftfs module from the kernel (e.g., rmmod) or

  2. Configuring Sysbox to not use shiftfs. You do this by configuring the systemd service unit for sysbox-mgr, and passing the --disable-shiftfs flag to Sysbox. See here for more.
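
Before picking between the two options, it may help to confirm whether shiftfs is loaded at all. The snippet below is a minimal sketch, not something Sysbox ships: `module_loaded` is a hypothetical helper that looks for a module name at the start of a line in `/proc/modules` (or in any file with the same layout, which makes it easy to try out).

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel module appears in a
# /proc/modules-style listing.
# Usage: module_loaded NAME [FILE]   (FILE defaults to /proc/modules)
module_loaded() {
    name=$1
    file=${2:-/proc/modules}
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q "^${name} " "$file" 2>/dev/null
}

if module_loaded shiftfs; then
    echo "shiftfs is loaded"
else
    echo "shiftfs is not loaded"
fi
```

If it reports that shiftfs is not loaded, neither workaround should be needed on that host.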

Hope that helps!

ctalledo avatar Oct 25 '22 23:10 ctalledo

This does help; thanks!

sfph avatar Oct 25 '22 23:10 sfph

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

felipecrs avatar Oct 31 '22 14:10 felipecrs

Here's a one-liner to disable shiftfs:

sudo mkdir -p /etc/systemd/system/sysbox-mgr.service.d && printf '%s\n' '[Service]' 'ExecStart=' 'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs' | sudo tee /etc/systemd/system/sysbox-mgr.service.d/override.conf && sudo systemctl daemon-reload && sudo systemctl restart sysbox
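
For anyone who prefers to see what that one-liner writes before running it, here is the same drop-in spelled out as a sketch. `render_override` is just an illustrative function name. The empty `ExecStart=` line is there because systemd only accepts one `ExecStart` for a regular service, so the packaged command has to be cleared before the new one is set. Note that this replaces the whole command line, so if your packaged unit passes other flags to sysbox-mgr (check with `systemctl cat sysbox-mgr`), you would want to carry them over.

```shell
#!/bin/sh
# Print the drop-in content that the one-liner writes to
# /etc/systemd/system/sysbox-mgr.service.d/override.conf.
render_override() {
    # Line 2 clears the ExecStart inherited from the packaged unit;
    # line 3 sets the replacement command with shiftfs disabled.
    printf '%s\n' \
        '[Service]' \
        'ExecStart=' \
        'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs'
}

render_override

# To apply it (same steps as the one-liner, needs root):
#   sudo mkdir -p /etc/systemd/system/sysbox-mgr.service.d
#   render_override | sudo tee /etc/systemd/system/sysbox-mgr.service.d/override.conf
#   sudo systemctl daemon-reload && sudo systemctl restart sysbox
```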

felipecrs avatar Oct 31 '22 14:10 felipecrs

I believe kernel 5.19 added ID-mapped-mount support for overlayfs (need to double-check). If true, then we will adjust sysbox to use ID-mapped-mounts for the container's rootfs too, and at that point ID-mapped-mounts would essentially replace shiftfs for all practical purposes.

Is there somewhere I can track changes in regards to sysbox/idmapped/kernel 5.19 or is there any roadmap for this to be included natively in docker now nestybox has been acquired?

pmb-nolwenture avatar Nov 02 '22 10:11 pmb-nolwenture

Hi @philipzgithub, assuming that overlayfs does in fact support ID-mapped-mounts, this will be included in the ~v0.7 release of Sysbox. Not sure on the timeline yet, likely ~Feb 2023.

In any case, overlayfs support for ID-mapped-mounts is a "nice-to-have", but not a "must-have" as mentioned in my comment above.

ctalledo avatar Nov 02 '22 16:11 ctalledo

Thank you @ctalledo for your response, you have saved me a lot of time and effort.

pmb-nolwenture avatar Nov 03 '22 06:11 pmb-nolwenture

the regression has been filed as a bug here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1990849

pmb-nolwenture avatar Nov 04 '22 09:11 pmb-nolwenture

the regression has been filed as a bug here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1990849

That's great, thanks for digging that up @philipzgithub.

ctalledo avatar Nov 04 '22 15:11 ctalledo

Here's a one-liner to disable shiftfs:

printf '%s\n' '[Service]' 'ExecStart=' 'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs' | sudo tee /etc/systemd/system/sysbox-mgr.service.d/override.conf && sudo systemctl daemon-reload && sudo systemctl restart sysbox

FYI, you might need to create /etc/systemd/system/sysbox-mgr.service.d first. Otherwise, this worked for me, thanks!

ScottG489 avatar Nov 10 '22 16:11 ScottG489

Oh yeah, I edited it. Thanks!

felipecrs avatar Nov 10 '22 16:11 felipecrs

@felipecrs

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

This is a good idea, especially with so many kernel changes going on these days in this area. During sysbox-mgr's initialization we could attempt to mount a shiftfs resource and decide to enable/disable shiftfs based on this.

rodnymolina avatar Nov 10 '22 17:11 rodnymolina

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

It's not trivial to implement though, because testing whether shiftfs-on-overlayfs works requires mounting shiftfs, and that in turn requires the process enter a new user-namespace, and that requires UID mappings, and so on ...

ctalledo avatar Nov 10 '22 19:11 ctalledo

My initial thought was to simply check if the kernel is Ubuntu >=5.15.0-58...

felipecrs avatar Nov 10 '22 19:11 felipecrs

My initial thought was to simply check if the kernel is Ubuntu >=5.15.0-58...

It's been broken since 5.15.0-48 I believe, and I believe in 5.17 and possibly 5.19 too; we don't know when the fix is coming, so it's hard to tie it to a kernel version.
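
For illustration only, a version check along those lines could be sketched as below. It is a heuristic with exactly the limitation described above: it flags Ubuntu 5.15.0-N kernels with N >= 48 and has no way of knowing which later build carries a fix. The function name `is_suspect_ubuntu_515` is made up for this sketch and is not part of Sysbox.

```shell
#!/bin/sh
# Heuristic only: succeeds for release strings of the form 5.15.0-N[-flavour]
# with N >= 48, the first build reported broken in this thread.
is_suspect_ubuntu_515() {
    release=$1                      # e.g. "5.15.0-52-generic", as printed by `uname -r`
    case $release in
        5.15.0-*) ;;
        *) return 1 ;;              # other kernel series are not judged here
    esac
    abi=${release#5.15.0-}          # "52-generic"
    abi=${abi%%-*}                  # "52"
    case $abi in
        ''|*[!0-9]*) return 1 ;;    # not a plain number: do not guess
    esac
    [ "$abi" -ge 48 ]
}

if is_suspect_ubuntu_515 "$(uname -r)"; then
    echo "kernel $(uname -r): shiftfs may be broken, consider --disable-shiftfs"
fi
```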

ctalledo avatar Nov 10 '22 19:11 ctalledo

Updating to 5.19.0-28 fixed the problem for me :)

bokenator avatar Jan 17 '23 03:01 bokenator

FYI: commit with the fix for shiftfs in Ubuntu: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/lunar/commit/fs/shiftfs.c?h=master-next&id=cfe3544e11cc53e0038410a2199ee6afeea3687f

Should be present in the upcoming Ubuntu 23.04 release (Lunar Lobster), due April 2023.

NOTE: the upcoming release of Sysbox (v0.6.0) will automatically check if shiftfs works on the host or not, and adjust accordingly. In platforms where it works, it will use it as needed. In platforms where it does not work, it will use an alternative mechanism. The new Sysbox release will also automatically check if the kernel supports ID-mapped mounts (kernel 5.12+) and overlayfs on ID-mapped mounted lower dirs (kernel 5.19+), and use both of these features. The latter one really makes shiftfs unnecessary going forward.
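
To get a rough idea of where a given host falls relative to those two thresholds, a plain version comparison can be sketched as follows. This is only an approximation and is not how Sysbox itself performs the check: a version number does not prove that a feature is enabled or works on a particular distro kernel. `kernel_at_least` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds if a release string such as "5.19.0-28-generic" is at least
# MAJOR.MINOR. Returns 2 if the string cannot be parsed.
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3
    major=${release%%.*}            # "5"
    rest=${release#*.}              # "19.0-28-generic"
    minor=${rest%%[!0-9]*}          # "19"
    case $major in ''|*[!0-9]*) return 2 ;; esac
    case $minor in '') return 2 ;; esac
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

rel=$(uname -r)
if kernel_at_least "$rel" 5 12; then
    echo "$rel: new enough for ID-mapped mounts (5.12+)"
fi
if kernel_at_least "$rel" 5 19; then
    echo "$rel: new enough for overlayfs on ID-mapped lower dirs (5.19+)"
fi
```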

ctalledo avatar Feb 21 '23 05:02 ctalledo