
add greenboot-like functionality integrated here

Open cgwalters opened this issue 2 years ago • 6 comments

Having had some discussions on https://pagure.io/fedora-iot/greenboot, I think we should actually fold this functionality into ostree proper. As is, the hooks it makes into the "control loop" of OS upgrades are extremely complicated.

In particular I'd like ostree admin status to have information about things like boot success etc.

I think we can take a new approach here:

  • Write the core logic in Rust in https://github.com/ostreedev/ostree-rs-ext/ (maybe, or we could keep it in C here)
  • UEFI only, using BootNext?
  • A very minimal systemd unit which runs on the next boot (in fact we already have one, xref https://github.com/ostreedev/ostree/pull/2589 ) that marks a successful state
  • Health checks based on e.g. systemd unit startup success can just be implemented by ordering against that success unit as a defined API; user logic chooses to then e.g. reboot back into the previous deployment

cgwalters • Oct 03 '22 16:10

One part of this is probably that the default target of OSTree-based distros should be systemd's boot-complete.target going forward (instead of e.g. multi-user.target, which is what is mostly used today).

This was also touched upon in the recently held Image-based Linux summit, see: https://github.com/uapi-group/docs/blob/main/minutes/2022-10-05__Image-based-linux-summit.md#prior-art and more generally https://uapi-group.org/

LorbusChris • Oct 27 '22 22:10

This part of the boot loader specification might also be relevant, to avoid creating two different mechanisms: https://github.com/uapi-group/specifications/blob/main/specs/boot_loader_specification.md#boot-counting

travier • Oct 28 '22 10:10

The boot counting mechanism described in the spec and used by sd-boot (i.e. putting the counter in the boot loader entry file) was deemed not implementable with grub, so greenboot implemented its own via a grub snippet and grub env vars. That's what's used by Fedora IoT / RHEL for Edge today. So we already have two different mechanisms unfortunately.
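
For readers unfamiliar with the spec-style counter mentioned above, here is a minimal sketch of how the counter is carried in the entry file name, in the form `<name>+<tries_left>[-<tries_done>].conf`. It only illustrates the naming scheme from the Boot Loader Specification; it is not code from sd-boot, greenboot, or ostree, and the function names and the entry name `ostree-1-fedora` are made up for the example.

```python
import re

# "<stem>+<tries_left>[-<tries_done>]<ext>", e.g. "ostree-1-fedora+3.conf"
# or "ostree-1-fedora+2-1.conf" (entry names here are hypothetical).
_COUNTER = re.compile(
    r"^(?P<stem>.+)\+(?P<left>\d+)(?:-(?P<done>\d+))?(?P<ext>\.[^.]+)$"
)


def parse_entry(filename):
    """Return (stem, tries_left, tries_done, ext), or None if there is no counter."""
    m = _COUNTER.match(filename)
    if m is None:
        return None
    return m["stem"], int(m["left"]), int(m["done"] or 0), m["ext"]


def record_attempt(filename):
    """File name the entry gets when a boot attempt starts."""
    parsed = parse_entry(filename)
    if parsed is None:
        return filename  # no counter: boot counting is not in use for this entry
    stem, left, done, ext = parsed
    if left == 0:
        return filename  # no tries left: the entry is already considered bad
    return f"{stem}+{left - 1}-{done + 1}{ext}"


def mark_good(filename):
    """File name after a successful boot: the counter is dropped entirely."""
    parsed = parse_entry(filename)
    if parsed is None:
        return filename
    stem, _left, _done, ext = parsed
    return f"{stem}{ext}"


if __name__ == "__main__":
    name = "ostree-1-fedora+3.conf"
    name = record_attempt(name)
    print(name)             # ostree-1-fedora+2-1.conf
    print(mark_good(name))  # ostree-1-fedora.conf
```

Because the state lives in the file name, the boot loader has to be able to rename files on the boot partition, which is the part described above as not implementable with grub.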

LorbusChris • Oct 28 '22 14:10

I am working on a rewrite of greenboot in Rust, and I am looking for ways to differentiate a regular reboot from a reboot after an upgrade or rollback. @cgwalters, as a newcomer to ostree, I would like to understand how

> we should actually fold this functionality into ostree proper

and how different that is from the existing greenboot.

say-paul • Mar 20 '23 10:03

> I am looking for options of how to differentiate a regular reboot vs reboot post upgrade/rollback.

I'd say we should use the ostree= kernel argument as the source of truth here. Today there are multiple things that log it, which we could find in the journal:

[root@cosa-devsh ~]# journalctl --grep=ostree=  -o cat |cat
Command line: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Kernel command line: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Unknown kernel command line parameters "BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0", will be passed to user space.
    ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Using kernel command line parameters:  ip=auto   BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
[root@cosa-devsh ~]# journalctl --grep=ostree= | more
Mar 20 14:12:19 localhost kernel: Command line: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree=/ostre
e/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Mar 20 14:12:19 localhost kernel: Kernel command line: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 ignition.firstboot ostree
=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Mar 20 14:12:19 localhost kernel: Unknown kernel command line parameters "BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8
ba72cbcfd865b42b77396813/0", will be passed to user space.
Mar 20 14:12:19 localhost kernel:     ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
Mar 20 14:12:19 localhost dracut-cmdline[413]: Using kernel command line parameters:  ip=auto   BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/vmlinuz-5.14.0-282.el9.x86_64 ignition.platform.id=qemu console=tty0 console
=ttyS0,115200n8 ignition.firstboot ostree=/ostree/boot.1/rhcos/8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0
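
As a rough illustration of treating the ostree= argument as the source of truth, the sketch below pulls it out of a kernel command line string instead of grepping the journal. The function names are made up for this example, and it assumes the value contains no whitespace or quoting, which holds for the output shown above.

```python
def ostree_karg(cmdline):
    """Return the value of the first ostree= argument, or None if absent."""
    prefix = "ostree="
    for token in cmdline.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None


def booted_ostree_karg(path="/proc/cmdline"):
    """Read the running kernel's command line and extract ostree= from it."""
    with open(path, encoding="utf-8") as f:
        return ostree_karg(f.read())


if __name__ == "__main__":
    sample = (
        "ignition.platform.id=qemu console=tty0 console=ttyS0,115200n8 "
        "ostree=/ostree/boot.1/rhcos/"
        "8bb3298191b10a91e3d87a8f67872865cb6d42a8ba72cbcfd865b42b77396813/0"
    )
    print(ostree_karg(sample))
```

Reading /proc/cmdline only answers the question for the current boot; comparing against earlier boots still needs the journal or some other persisted record.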

There's also journalctl -u ostree-prepare-root, which logs:

Mar 20 14:12:22 localhost ostree-prepare-root[1132]: Resolved OSTree target to: /sysroot/ostree/deploy/rhcos/deploy/350495a02a76b33ab9436d5eeca7328417683292184d9e1829fb4268ff78c7cc.0

We could adjust that one to log this as structured data.

Now, not every system will have a persistent journal. I could imagine that we record a "last booted" field somewhere in ostree associated with a deployment. That'd be cheaper and more reliable to parse.
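
To make the "last booted" idea concrete, here is a minimal sketch that keeps the record in a small JSON file and classifies each boot against it. This is purely hypothetical: ostree has no such field today, and the file layout, the `last_booted` key, and the function name are assumptions for illustration. The deployment identifier is treated as an opaque string (it could be the ostree= value or the resolved deployment path).

```python
import json
import os
import tempfile


def classify_boot(current, state_path):
    """Compare the booted deployment with the recorded one, then update the record.

    Returns "first-recorded-boot", "regular-reboot", or "deployment-changed".
    """
    previous = None
    try:
        with open(state_path, encoding="utf-8") as f:
            previous = json.load(f).get("last_booted")
    except FileNotFoundError:
        pass  # nothing recorded yet

    if previous is None:
        kind = "first-recorded-boot"
    elif previous == current:
        kind = "regular-reboot"
    else:
        kind = "deployment-changed"

    # Write to a temporary file and rename it over the old record, so a crash
    # mid-write cannot leave a truncated file behind.
    directory = os.path.dirname(state_path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"last_booted": current}, f)
    os.replace(tmp, state_path)
    return kind


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        state = os.path.join(d, "last-booted.json")
        print(classify_boot("deploy-a.0", state))  # first-recorded-boot
        print(classify_boot("deploy-a.0", state))  # regular-reboot
        print(classify_boot("deploy-b.0", state))  # deployment-changed
```

On its own this cannot tell an upgrade from a rollback; that would need more context, such as the order of deployments.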

cgwalters • Mar 20 '23 14:03

> And how different it is from existing greenboot.

I think this is covered by the initial comment, right?

> Having some discussions on https://pagure.io/fedora-iot/greenboot and I think we should actually fold this functionality into ostree proper. As is the hooks it makes into the "control loop" of OS upgrades are extremely complicated.
>
> In particular I'd like ostree admin status to have information about things like boot success etc.

cgwalters • Mar 20 '23 14:03