runc 1.0.1 and containerd 1.6.1

Open deitch opened this issue 3 years ago • 18 comments

Includes the fix for https://nvd.nist.gov/vuln/detail/CVE-2022-23648

Note that highlighting in GitHub diff is a bit off because of the lines that start with - in the .patch file.

deitch avatar Mar 21 '22 11:03 deitch

This is failing because of its dependency on deprecated notary trust. #2566 fixes this. Once that is merged in, we can rebase on that and merge this one in.

deitch avatar Mar 21 '22 12:03 deitch

Rebased now that #2566 is merged in. Let's let CI run, and we can resolve any issues that arise.

deitch avatar Mar 22 '22 09:03 deitch

Kick off the tests

Seems like Eden shows that this fails to onboard, which means it has some rather basic issues to boot??

eriknordmark avatar Mar 22 '22 12:03 eriknordmark

> Kick off the tests
>
> Seems like Eden shows that this fails to onboard, which means it has some rather basic issues to boot??

In the Eden workflow artifacts (linked at the bottom of the page), console.log shows:

getty: cmdline has console=hvc0 but /dev/hvc0 is not a character device; not starting getty for hvc0
getty: cmdline has console=hvc0 but /dev/hvc0 is not a character device; not starting getty for hvc0


................   ..............   ................
 ................   ............   ................ 
              ....    .........   ....              
    ................   .......   ................   
     ................   .....   ................    
                    ...   .   ....                  
        ................     ................       
          ...............   ................        

              Edge Virtualization Engine
linuxkit-525400123456 login: root (automatic login)


[   13.113270][    C0] random: fast init done
EVE is Edge Virtualization Engine

Take a look around and don't forget to use eve(1).
login[311]: root login on 'ttyS0'
linuxkit-525400123456:~# [   20.360621][  T555] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
[   20.379325][  T561] IPVS: ftp: loaded support on port[0] = 21
2022-03-22T12:22:32Z,onboot.000-rngd;2022/03/22 12:22:32 No random source available
[   23.492379][  T634] leds_siemens_ipc127: No SIMATIC IPC127E detected.
[   23.580800][  T634] leds_siemens_ipc127: No SIMATIC IPC127E detected.
[   24.231968][    C0] random: crng init done
[   25.721489][  T689] spl: loading out-of-tree module taints kernel.
[   25.748109][  T689] znvpair: module license 'CDDL' taints kernel.
[   25.748407][  T689] Disabling lock debugging due to kernel taint
[   27.254319][  T689] ZFS: Loaded module v2.1.2-1, ZFS pool version 5000, ZFS filesystem version 5
[   27.722856][  T717] ZFS: Unloaded module v2.1.2-1
2022-03-22T12:22:36Z,onboot.003-storage-init.out;2022-03-22T12:22:36,605031418+00:00 No separate /config partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;2022-03-22T12:22:39,412408417+00:00 No separate /persist partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;Can not determine persist filesystem type
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
2022-03-22T12:22:39Z,onboot.003-storage-init;/storage-init.sh: line 15: can't open /run/eve.persist_type: no such file
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 301: can't create /persist/SMART_details.json: Read-only file system
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 302: can't create /persist/SMART_details_previous.json: Read-only file system
WARN[0001] deprecated version : `1`, please switch to version `2`
containerd: mkdir /persist/containerd: read-only file system

It seems we cannot get through storage-init, so no EVE services are started.

petr-zededa avatar Mar 22 '22 16:03 petr-zededa

If I understood storage-init correctly, it is responsible for finding the proper partitions for /config and /persist and mounting them, since the root filesystem is read-only and should be a separate partition.

It finds those by running findfs PARTLABEL=CONFIG and findfs PARTLABEL=P3. If those fail, it cannot determine them.

From these lines:

2022-03-22T12:22:36Z,onboot.003-storage-init.out;2022-03-22T12:22:36,605031418+00:00 No separate /config partition
2022-03-22T12:22:39Z,onboot.003-storage-init.out;2022-03-22T12:22:39,412408417+00:00 No separate /persist partition
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
2022-03-22T12:22:39Z,onboot.003-storage-init.out;Can not determine persist filesystem type
2022-03-22T12:22:39Z,onboot.003-storage-init;/storage-init.sh: line 15: can't open /run/eve.persist_type: no such file
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 301: can't create /persist/SMART_details.json: Read-only file system
2022-03-22T12:22:43Z,onboot.003-storage-init;/storage-init.sh: line 302: can't create /persist/SMART_details_previous.json: Read-only file system

It looks like every run of findfs failed.
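
To make that reading of the log concrete, here is a minimal sketch that pulls the unresolved partition labels out of console output in the format shown above. The function name `failed_partlabels` and the `SAMPLE` constant are hypothetical names introduced for illustration; the sample lines are copied from the log in this thread, and the pattern assumes the `findfs: unable to resolve 'PARTLABEL=...'` wording seen there.

```python
import re

# Matches the findfs failure message as it appears in the console log above.
FINDFS_FAILURE = re.compile(r"findfs: unable to resolve 'PARTLABEL=([^']+)'")


def failed_partlabels(log_text):
    """Return the partition labels findfs could not resolve, in first-seen order."""
    seen = []
    for match in FINDFS_FAILURE.finditer(log_text):
        label = match.group(1)
        if label not in seen:  # P3 is looked up twice; report it once
            seen.append(label)
    return seen


# Lines copied from the console.log excerpt in this thread.
SAMPLE = """\
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=CONFIG'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGA'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=IMGB'
2022-03-22T12:22:36Z,onboot.003-storage-init;findfs: unable to resolve 'PARTLABEL=P3'
2022-03-22T12:22:38Z,onboot.003-storage-init;Failed to initialize the libzfs library.
"""

if __name__ == "__main__":
    print(failed_partlabels(SAMPLE))
```

Run against the excerpt, this lists CONFIG, P3, IMGA and IMGB, that is, every label storage-init asked for.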

It is unclear to me why. I definitely can run it from getty, which also is running onboot, so something must be different. Also, /config is mounted, so something is mounting it.

More to investigate.

deitch avatar Mar 23 '22 05:03 deitch

> It is unclear to me why. I definitely can run it from getty, which also is running onboot, so something must be different. Also, /config is mounted, so something is mounting it.

getty is inside init section: https://github.com/lf-edge/eve/blob/57ae80ee09684a002658a5c5f64d15f0c6b267e6/images/rootfs.yml.in#L8

giggsoff avatar Mar 23 '22 13:03 giggsoff

I have updated this to a more recent version of linuxkit/init. With that, findfs does work. However, we now get permission errors when trying to fsck or mount.

The simple fix is to add the devices entries to build.yml for pkg/storage-init. The linuxkit binary does know how to parse those and add them to the runc config.

BUT, the version of linuxkit that does so is more recent, and uses the linuxkit OCI on-disk cache for builds. eve-os does not use that one because of the chained builds issue, which has been open with buildkit for over a year, with no end in sight.

If we move to a recent linuxkit version, we fix this issue, but break chained builds that do not push images to a registry.

Will take some time to try and think of a workaround.

deitch avatar Mar 23 '22 18:03 deitch

@deitch is this back to the drawing board or is it something which should work and we just need to test it? Can't tell from the comment above.

eriknordmark avatar Mar 24 '22 19:03 eriknordmark

Not quite the drawing board, but I need to figure out a way to solve a problem.

deitch avatar Mar 24 '22 20:03 deitch

We know how to fix this: the config.json needs to have the devices section in it. linuxkit can generate that section, but only from a certain version onwards, and that version includes the buildkit cross-platform builder, which cannot do chained builds.
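
For readers unfamiliar with that section, the following is a minimal sketch of what a devices entry in an OCI runtime config.json looks like, built here as a Python dictionary. The field names (`path`, `type`, `major`, `minor`, `fileMode`, `uid`, `gid`) come from the OCI runtime spec's `linux.devices` list. Everything else is an assumption made for illustration: the helper names `device_entry` and `with_devices`, the base config, and the choice of `/dev/sda` (block device, major 8, minor 0) are hypothetical and are not taken from the storage-init package or from linuxkit's generated output.

```python
import json


def device_entry(path, dev_type, major, minor, file_mode=0o660, uid=0, gid=0):
    """Build one entry for the OCI runtime spec's linux.devices list."""
    return {
        "path": path,
        "type": dev_type,       # "b" for block, "c" for character devices
        "major": major,
        "minor": minor,
        "fileMode": file_mode,  # serialized as a plain integer in JSON
        "uid": uid,
        "gid": gid,
    }


def with_devices(config, devices):
    """Return a copy of config with the given entries appended under linux.devices."""
    updated = json.loads(json.dumps(config))  # simple deep copy for JSON-like data
    linux = updated.setdefault("linux", {})
    linux.setdefault("devices", []).extend(devices)
    return updated


# Hypothetical, heavily trimmed base config; a real config.json has many more fields.
base = {"ociVersion": "1.0.2", "process": {"args": ["/storage-init.sh"]}}

# Illustrative device only: /dev/sda as block device 8:0.
config = with_devices(base, [device_entry("/dev/sda", "b", 8, 0)])

if __name__ == "__main__":
    print(json.dumps(config, indent=2))
```

This only shows the shape of the section; which devices storage-init actually needs, and how linuxkit derives them from build.yml, is not covered by the sketch.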

deitch avatar Mar 24 '22 21:03 deitch

I am going to open a separate tracking issue for the chained builds issue. We can resolve that one there, and then this becomes straightforward.

deitch avatar Mar 25 '22 05:03 deitch

Can we try to go another way and apply several patches to the current version of linuxkit we use inside the build system?

giggsoff avatar Mar 31 '22 12:03 giggsoff

Not really.

First, we don't want to, because it means maintaining yet another piece of software. Way (way) too much of eve-os is patches on other upstream software; often it was just because we wanted a fix right away, couldn't wait a few weeks for something to get in. Every now and then we try and clean that up because we got out of sync, and it is a pain. I have done a number of these runs myself on eve; it isn't fun.

Second, it isn't that linuxkit itself is out of date; linuxkit just delegates building to docker, which delegates to buildkit (no reason linuxkit couldn't call buildkit directly; we have discussed it). The problem is buildkit, which is changing at a really quick pace. Do we really want to have a fork of that, which we would need to maintain?

deitch avatar Mar 31 '22 12:03 deitch

> First, we don't want to, because it means maintaining yet another piece of software. Way (way) too much of eve-os is patches on other upstream software; often it was just because we wanted a fix right away, couldn't wait a few weeks for something to get in. Every now and then we try and clean that up because we got out of sync, and it is a pain. I have done a number of these runs myself on eve; it isn't fun.

Here you are absolutely right, this is not the best solution. I'm just thinking about how to speed up this particular change. In our situation the needed patches are already upstream (https://github.com/linuxkit/linuxkit/commit/24db42dd686a69afbc5b01cabb01fdf088d2cf44 and https://github.com/linuxkit/linuxkit/commit/46ea02f65b077a37bd807d25a3a46fdd4eb0bc46), so we should not hit any problems staying in sync: we can use them now and remove them completely once we move to a version of linuxkit that includes them.

giggsoff avatar Apr 01 '22 09:04 giggsoff

I've been spending quite some time on this. I think I may have figured out how to get around the issue, but it requires fixes at the (complex) buildkit level (I have their agreement, but need to do the work), then the linuxkit level (medium), then the eve build level (easy).

deitch avatar Apr 06 '22 14:04 deitch

Is this now blocked on your other buildkit/lk PR @deitch ?

rvs avatar Apr 07 '22 16:04 rvs

Yes.

deitch avatar Apr 07 '22 16:04 deitch

@deitch can we close this PR?

eriknordmark avatar Oct 14 '22 23:10 eriknordmark

Yes, @giggsoff got it with #2649.

deitch avatar Oct 16 '22 08:10 deitch