runc 1.0.1 and containerd 1.6.1
Includes the fix for https://nvd.nist.gov/vuln/detail/CVE-2022-23648
Note that highlighting in GitHub diff is a bit off because of the lines that start with - in the .patch file.
This is failing because of its dependency on deprecated notary trust. #2566 fixes this. Once that is merged in, we can rebase on that and merge this one in.
Rebased now that #2566 is merged in. Let's let CI run, and we can resolve any issues that arise.
Kick off the tests
Seems like Eden shows that this fails to onboard, which means it has some rather basic issues to boot??
In the Eden workflow artifacts (linked at the bottom of the page), console.log shows:
getty: cmdline has console=hvc0 but /dev/hvc0 is not a character device; not starting getty for hvc0
getty: cmdline has console=hvc0 but /dev/hvc0 is not a character device; not starting getty for hvc0
................ .............. ................
................ ............ ................
.... ......... ....
................ ....... ................
................ ..... ................
... . ....
................ ................
............... ................
Edge Virtualization Engine
linuxkit-525400123456 login: root (automatic login)
[ 13.113270][ C0] random: fast init done
EVE is Edge Virtualization Engine
Take a look around and don't forget to use eve(1).
login[311]: root login on 'ttyS0'
linuxkit-525400123456:~# [ 20.360621][ T555] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
[ 20.379325][ T561] IPVS: ftp: loaded support on port[0] = 21
2022-03-22T12:22:32Z,onboot.000-rngd;2022/03/22 12:22:32 No random source available
[ 23.492379][ T634] leds_siemens_ipc127: No SIMATIC IPC127E detected.
[ 23.580800][ T634] leds_siemens_ipc127: No SIMATIC IPC127E detected.
[ 24.231968][ C0] random: crng init done
[ 25.721489][ T689] spl: loading out-of-tree module taints kernel.
[ 25.748109][ T689] znvpair: module license 'CDDL' taints kernel.
[ 25.748407][ T689] Disabling lock debugging due to kernel taint
[ 27.254319][ T689] ZFS: Loaded module v2.1.2-1, ZFS pool version 5000, ZFS filesystem version 5
[ 27.722856][ T717] ZFS: Unloaded module v2.1.2-1
2022-03-22T12:22:36Z,onboot.003-storage-init.out;2022-03-22T12:22:36,605031418+00:00 No separate /config partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;2022-03-22T12:22:39,412408417+00:00 No separate /persist partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;Can not determine persist filesystem type
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
2022-03-22T12:22:39Z,onboot.003-storage-init;/storage-init.sh: line 15: can't open /run/eve.persist_type: no such file
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 301: can't create /persist/SMART_details.json: Read-only file system
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 302: can't create /persist/SMART_details_previous.json: Read-only file system
WARN[0001] deprecated version : `1`, please switch to version `2`
containerd: mkdir /persist/containerd: read-only file system
It seems we cannot get through storage-init, so no EVE services are started.
If I understood storage-init correctly, it is responsible for finding the proper partitions for /config and /persist and mounting them, since the root filesystem is read-only and should be a separate partition.
It finds those by running findfs PARTLABEL=CONFIG and findfs PARTLABEL=P3. If those fail, it cannot determine them.
From these lines:
2022-03-22T12:22:36Z,onboot.003-storage-init.out;2022-03-22T12:22:36,605031418+00:00 No separate /config partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;2022-03-22T12:22:39,412408417+00:00 No separate /persist partition
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
2022-03-22T12:22:39Z,onboot.003-storage-init.out;Can not determine persist filesystem type
2022-03-22T12:22:39Z,onboot.003-storage-init;/storage-init.sh: line 15: can't open /run/eve.persist_type: no such file
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 301: can't create /persist/SMART_details.json: Read-only file system
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 302: can't create /persist/SMART_details_previous.json: Read-only file system
It looks like every run of findfs failed.
It is unclear to me why. I definitely can run it from getty, which also is running onboot, so something must be different. Also, /config is mounted, so something is mounting it.
More to investigate.
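To make the "every run of findfs failed" observation easy to re-check against a fresh console.log, here is a minimal Python sketch. It assumes the log keeps the `timestamp,service;message` layout seen in the lines quoted above; the function name `unresolved_labels` and the `LOG` sample are only illustrative and are not part of EVE or Eden.

```python
import re

# Sample lines copied from the console.log excerpt quoted in this thread.
LOG = """\
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
"""

# Assumed layout: <timestamp>,<service>;<message>
FINDFS_FAILURE = re.compile(
    r"^(?P<ts>[^,]+),(?P<svc>[^;]+);"
    r"findfs: unable to resolve 'PARTLABEL=(?P<label>[^']+)'$"
)


def unresolved_labels(log_text):
    """Count, per partition label, how often findfs failed to resolve it."""
    counts = {}
    for line in log_text.splitlines():
        match = FINDFS_FAILURE.match(line.strip())
        if match:
            label = match.group("label")
            counts[label] = counts.get(label, 0) + 1
    return counts


print(unresolved_labels(LOG))
# {'CONFIG': 1, 'P3': 2, 'IMGA': 1, 'IMGB': 1}
```

All four labels that storage-init looks for show up as unresolved, with P3 queried twice, which matches the reading above that no findfs call succeeded.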
It is unclear to me why. I definitely can run it from getty, which also is running onboot, so something must be different. Also, /config is mounted, so something is mounting it.
getty is inside init section: https://github.com/lf-edge/eve/blob/57ae80ee09684a002658a5c5f64d15f0c6b267e6/images/rootfs.yml.in#L8
I have updated this to a more modern version of linuxkit/init. This in turn actually does work in terms of findfs working. However, we get permissions errors in trying to fsck or mount.
The simple fix is to add the devices entries to build.yml for pkg/storage-init. The linuxkit binary does know how to parse those and add them to the runc config.
BUT, the version of linuxkit that does so is more recent, and uses the linuxkit OCI on-disk cache for builds. eve-os does not use that one because of the chained builds issue, which has been open with buildkit for over a year, with no end in sight.
If we move to a recent linuxkit version, we fix this issue, but break chained builds that do not push images to a registry.
Will take some time to try and think of a workaround.
@deitch is this back to the drawing board or is it something which should work and we just need to test it? Can't tell from the comment above.
Not quite the drawing board, but I need to figure out a way to solve a problem.
We know how to fix this: the config.json needs to have the devices section in it. To do that, linuxkit needs to generate it, which it does. But only from a certain version onwards, which includes the buildkit cross-platform builder, which cannot do chained builds.
I am going to open a separate tracking issue for the chained builds issue. We can resolve that one there, and then this becomes straightforward.
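For readers unfamiliar with what "the devices section" in config.json means, here is a rough Python sketch. It assumes the relevant part is the device cgroup allow-list under `linux.resources.devices` in the OCI runtime spec, whose entries use the `allow`, `type`, `major`, `minor`, and `access` fields. The helper `allow_block_devices` and the sample `base` config are hypothetical, and the entries that linuxkit actually generates for pkg/storage-init may look different.

```python
import copy
import json


def allow_block_devices(config):
    """Return a copy of an OCI runtime config that allows all block devices.

    Illustration only: it adds one device cgroup rule (type "b", access "rwm")
    and leaves every other part of the config untouched.
    """
    patched = copy.deepcopy(config)
    rules = (
        patched.setdefault("linux", {})
        .setdefault("resources", {})
        .setdefault("devices", [])
    )
    rule = {"allow": True, "type": "b", "access": "rwm"}
    if rule not in rules:  # keep the helper idempotent
        rules.append(rule)
    return patched


# Hypothetical minimal config.json content, not taken from EVE.
base = {
    "ociVersion": "1.0.2",
    "linux": {"resources": {"devices": [{"allow": False, "access": "rwm"}]}},
}

patched = allow_block_devices(base)
print(json.dumps(patched["linux"]["resources"]["devices"], indent=2))
```

With a default deny-all rule and no allow rule for block devices, a container can see the device nodes but fails when it tries to open them, which would be consistent with the permission errors on fsck and mount described earlier.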
Can we try to go another way and apply several patches to the current version of linuxkit we use inside the build system?
Not really.
First, we don't want to, because it means maintaining yet another piece of software. Way (way) too much of eve-os is patches on other upstream software; often it was just because we wanted a fix right away, couldn't wait a few weeks for something to get in. Every now and then we try and clean that up because we got out of sync, and it is a pain. I have done a number of these runs myself on eve; it isn't fun.
Second, it isn't that linuxkit itself is out of date; linuxkit just delegates building to docker, which delegates to buildkit (no reason linuxkit couldn't call buildkit directly; we have discussed it). The problem is buildkit, which is changing at a really quick pace. Do we really want to have a fork of that, which we would need to maintain?
First, we don't want to, because it means maintaining yet another piece of software. Way (way) too much of eve-os is patches on other upstream software; often it was just because we wanted a fix right away, couldn't wait a few weeks for something to get in. Every now and then we try and clean that up because we got out of sync, and it is a pain. I have done a number of these runs myself on eve; it isn't fun.
Here you are absolutely right; this is not the best solution. I'm just thinking about how to speed up this particular change. In our situation, the needed patches are already upstream (https://github.com/linuxkit/linuxkit/commit/24db42dd686a69afbc5b01cabb01fdf088d2cf44 and https://github.com/linuxkit/linuxkit/commit/46ea02f65b077a37bd807d25a3a46fdd4eb0bc46), so we should not run into problems staying in sync: we can use them now and remove them completely once we move to a version of linuxkit that includes them.
I've been spending quite some time on this. I think I may have figured out how to get around the issue, but it requires fixes at the (complex) buildkit level (I have their agreement, but need to do the work), then the linuxkit level (medium), then the eve build level (easy).
Is this now blocked on your other buildkit/lk PR @deitch ?
Yes.
@deitch can we close this PR?
Yes, @giggsoff got it with #2649.