Implement systemd-boot boot assessment
systemd-boot has a way to perform boot assessment and fall back to other entries if booting fails. It is described in detail here and here. It's not very complicated and only requires us to name the conf/efi files in a certain way and also make sure we order entries properly (so that the right one is picked as a fallback).
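For reference, the naming convention boils down to a `+LEFT` or `+LEFT-DONE` tag placed just before the file extension. A minimal sketch of reading that tag follows; the `Entry`, `parse_entry` and `state` names are made up for illustration and are not part of systemd or Kairos.

```python
import os
import re
from typing import NamedTuple, Optional

# "+LEFT" or "+LEFT-DONE" at the end of the file stem, e.g. "active+2-1"
_COUNTER = re.compile(r"^(?P<base>.+)\+(?P<left>\d+)(?:-(?P<done>\d+))?$")


class Entry(NamedTuple):
    base: str             # entry name without the counter tag
    ext: str              # ".conf" or ".efi"
    left: Optional[int]   # tries left; None when boot counting is off
    done: int             # tries already used


def parse_entry(filename: str) -> Entry:
    """Split a boot entry file name into its name, extension and counters."""
    stem, ext = os.path.splitext(filename)
    m = _COUNTER.match(stem)
    if m is None:
        return Entry(stem, ext, None, 0)
    return Entry(m["base"], ext, int(m["left"]), int(m["done"] or 0))


def state(entry: Entry) -> str:
    """No tag means good, zero tries left means bad, anything else is undecided."""
    if entry.left is None:
        return "good"
    return "bad" if entry.left == 0 else "indeterminate"


print(parse_entry("active+2-1.conf"))          # base='active', left=2, done=1
print(state(parse_entry("active+0-3.conf")))   # bad
```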
Note: Originally investigated while documenting how Kairos does boot assessment.
I can help test this when someone is ready for testing.
I was also thinking about how the system moves from a failed active AND passive into recovery or reset.
Right now recovery requires human intervention and doesn't load any sysext options, so it has to be pretty bare bones as we are keeping UKI images small. I was thinking about building an auto update script for recovery that runs and tries to fix active/passive by running an upgrade and/or checking an HTTPS website for instructions. It would then not auto update the systemd-boot count for recovery, and instead let a successful active/passive boot reset the count for recovery. This would make sure that if recovery fails to recover the system after X attempts, a reset is triggered, which hopefully can do a better job of setting everything right and blowing away filesystems to clean it up.
Planning decision:
Let's implement the default fallback mechanism of systemd first and then see if we can implement the auto-reset feature using stages and such (extract to different ticket when the first part is done)
Being able to auto-reset a system that doesn't boot makes sense, especially in cases like:
- System without users (https://github.com/kairos-io/kairos/issues/2921)
- Remote systems with no way for someone to visit and fix
with the given patch it seems to work BUT
- we are missing the systemd-bless-boot service and binary which changes the tries left/used so after 3 boots the entries are marked as bad
- even if we make that work, it will not work because we mount the efi partition RO
2 possible outcomes:
- Mount EFI as RW during initramfs, remount it as RO at the end of the UKI boot process
- Create our own service that remounts as RW, changes the current entry (mark good basically) and remounts RO
thoughts @kairos-io/maintainers
Basically this is the expected workflow of the boot assessment for reference: https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT/
Important part below
Let’s say the second boot succeeds. The kernel initializes properly, systemd is started and invokes all generators.
One of the generators started is systemd-bless-boot-generator which detects that boot counting is used.
It hence pulls systemd-bless-boot.service into the initial transaction.
systemd-bless-boot.service is ordered after and Requires= the generic boot-complete.target unit.
This unit is hence also pulled into the initial transaction.
The boot-complete.target unit is ordered after and pulls in various units that are required to succeed for the boot process to be considered successful.
One such unit is systemd-boot-check-no-failures.service.
systemd-boot-check-no-failures.service is run after all its own dependencies completed, and assesses that the boot completed successfully. It hence exits cleanly.
This allows boot-complete.target to be reached. This signifies to the system that this boot attempt shall be considered successful.
Which in turn permits systemd-bless-boot.service to run. It now determines which boot loader entry file was used to boot the system, and renames it dropping the counter tag. Thus 4.14.11-300.fc27.x86_64+1-2.conf is renamed to 4.14.11-300.fc27.x86_64.conf. From this moment boot counting is turned off for this entry.
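The renames in that workflow can be sketched as two small string transformations. This is only an illustration of the file name changes, not how systemd-boot or systemd-bless-boot are implemented; `next_attempt` and `bless_good` are hypothetical helper names, and leaving an entry with zero tries left untouched is an assumption.

```python
import re

# Counter tag right before the extension: "+LEFT" or "+LEFT-DONE"
_TAG = re.compile(r"\+(\d+)(?:-(\d+))?(?=\.(?:conf|efi)$)")


def next_attempt(filename: str) -> str:
    """Name of the entry after the boot loader starts one more attempt."""
    m = _TAG.search(filename)
    if m is None:
        return filename  # boot counting is off for this entry
    left, done = int(m.group(1)), int(m.group(2) or 0)
    if left == 0:
        return filename  # assumption: a bad entry is not renamed any further
    return f"{filename[:m.start()]}+{left - 1}-{done + 1}{filename[m.end():]}"


def bless_good(filename: str) -> str:
    """Name of the entry after a successful boot: the counter tag is dropped."""
    return _TAG.sub("", filename)


print(next_attempt("active+3.conf"))                   # active+2-1.conf
print(bless_good("4.14.11-300.fc27.x86_64+1-2.conf"))  # 4.14.11-300.fc27.x86_64.conf
```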
Mount EFI as RW during initramfs, remount it as RO at the end of the UKI boot process
I don't think this works for us, as we need to wait for boot-complete.target, which is reached in userspace rather than in the initramfs.
We could also have a manual service that runs after systemd multi-user.target
- pre: remounts EFI as RW (it's already mounted as RO)
- runs the bless boot binary manually
- post: remounts EFI as RO
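The three steps above could look roughly like this. It is a sketch only: the `/efi` mount point comes from this thread, the binary path is the one from the upstream man page and may differ per distro, and `bless_with_remount` with its injectable `run` parameter is a hypothetical helper. The `finally` makes sure the partition goes back to RO even if the bless step fails.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


def bless_with_remount(
    run: Runner = _run,
    esp: str = "/efi",
    bless: str = "/usr/lib/systemd/systemd-bless-boot",  # assumed location
) -> None:
    """Remount the ESP read-write, mark the current entry good, remount read-only."""
    run(["mount", "-o", "remount,rw", esp])
    try:
        run([bless, "good"])
    finally:
        # Always restore the read-only mount, even when blessing failed
        run(["mount", "-o", "remount,ro", esp])
```

Passing a fake `run` (for example one that only records the commands) lets the ordering be checked without touching any real mount.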
with the given patch it seems to work BUT
* we are missing the systemd-bless-boot service and binary which changes the tries left/used so after 3 boots the entries are marked as bad
* even if we make that work, it will not work because we mount the efi partition RO
2 possible outcomes:
* Mount EFI as RW during initramfs, remount it as RO at the end of the UKI boot process
mmh complex but doable, the only challenge I see there is to fire the systemd services exactly in that timeframe, not sure if possible if not by calling systemd-bless-boot inside immucore
* Create our own service that remounts as RW, changes the current entry (mark good basically) and remounts RO
That looks like the sanest solution at this point; however, my only concern here is whether systemd-bless-boot will get more business logic from systemd that we might miss. At that point, wouldn't it be equivalent to calling systemd-bless-boot from immucore directly?
mmh complex but doable, the only challenge I see there is to fire the systemd services exactly in that timeframe, not sure if possible if not by calling systemd-bless-boot inside immucore
Yeah, after checking more deeply this won't work, as the bless happens once the system is fully up, so in userspace once systemctl reports everything as running. Out of immucore's control unfortunately.
* Create our own service that remounts as RW, changes the current entry (mark good basically) and remounts RO
That looks like the sanest solution at this point; however, my only concern here is whether systemd-bless-boot will get more business logic from systemd that we might miss. At that point, wouldn't it be equivalent to calling systemd-bless-boot from immucore directly?
Seems like we may be able to do it ourselves by just calling the binary. So mimicking the bless service but with extra steps. Maybe even with a simple override to run pre and post for the mounts. So we don't need to reimplement the whole thing.
Maybe even with a simple override to run pre and post for the mounts. So we don't need to reimplement the whole thing
That was exactly what I was thinking. We need to modify the path for systemd-bless-boot anyway since we don't use /boot
Maybe changing systemd-bless-boot.service with an override file to have something like:
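[Service]
# Remount /efi as read-write before starting the main service
ExecStartPre=/usr/bin/mount -o remount,rw /efi
# Clear the packaged ExecStart first; in a oneshot unit a second ExecStart is appended, not replaced
ExecStart=
# Modify ExecStart to include --path=/efi (binary path as in the upstream man page, may differ per distro)
ExecStart=/usr/lib/systemd/systemd-bless-boot good --path=/efi
# Remount /efi as read-only after the service completes
ExecStartPost=/usr/bin/mount -o remount,ro /efi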
I actually tested this with overrides for mounting and unmounting the partition and it worked as expected. I think it gets the path automatically, either by identifying the partition type or from the systemd-boot efivars, but it does actually work as expected.
With this override the bless-boot service works
### /etc/systemd/system/systemd-bless-boot.service.d/override.conf
[Service]
ExecStartPre=mount -o remount,rw /efi
ExecStartPost=mount -o remount,ro /efi
Notice that we also need to override another service, systemd-boot-random-seed, as it is pulled in automatically and needs write access to efi
### /etc/systemd/system/systemd-boot-random-seed.service.d/override.conf
[Service]
ExecStartPre=mount -o remount,rw /efi
ExecStartPost=mount -o remount,ro /efi
There is still an issue, but we can work around it with this override for systemd-bless-boot:
[Service]
ExecStartPre=mount -o remount,rw /efi
ExecStartPost=sed -i -E 's/(default\s+)*\+[0-9]+(-[0-9]+)?(\.conf)/\1\3/' /efi/loader/loader.conf
ExecStartPost=mount -o remount,ro /efi
So in our loader.conf we set the specific config that we want to run, for example active.conf. With boot assessment this is automatically set to something like active+3.conf
The main problem is that when bless-boot marks a config as good after booting, it renames it to remove the boot assessment counter, so active+3.conf turns into active.conf. But loader.conf is not updated, so it still points to active+3.conf, which no longer matches the actual config. There is glob support in the default stanza, but that is not good enough when we have extra efis with different cmdlines: we want to match the name, or the name plus the boot assessment counter, not a greedy match which could lead to picking activeBad.conf
So to fix that, we can use the service itself to remove any mention of the boot assessment part in loader.conf with sed :D
I tested this with an active+3.conf which turns into active+2-1.conf on the first boot due to how assessment works; then bless-boot triggered and marked it as good, changing the conf to active.conf. Then sed removed the +3 part from the loader.conf entry correctly.
I think we can work with this. I will test it further but seems to work as expected.
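To make the two points above concrete, here is a small sketch in Python. `fnmatchcase` only stands in for systemd-boot's own glob matching (an assumption that they behave alike for a simple `*`), the regex is a simplified equivalent of the sed expression rather than a copy of it, and the `timeout 5` line is a made-up example of other loader.conf content.

```python
import re
from fnmatch import fnmatchcase

# Why a glob in the default stanza is too greedy: it also matches other entries
candidates = ["active.conf", "active+2-1.conf", "activeBad.conf"]
print([name for name in candidates if fnmatchcase(name, "active*.conf")])
# ['active.conf', 'active+2-1.conf', 'activeBad.conf']

# Counter tag ("+3" or "+2-1") that sits directly before ".conf"
_COUNTER_TAG = re.compile(r"\+[0-9]+(?:-[0-9]+)?(?=\.conf)")


def strip_counter(loader_conf: str) -> str:
    """Drop boot assessment counters from entry names in loader.conf text."""
    return _COUNTER_TAG.sub("", loader_conf)


print(strip_counter("default active+3.conf\ntimeout 5\n"), end="")
# default active.conf
# timeout 5
```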
Moving pieces needed to fully implement this:
- overrides for systemd-bless-boot and systemd-boot-random-seed to remount efi as RW in the default static files under packages https://github.com/kairos-io/packages/pull/1149
- missing package in base images systemd-boot for ubuntu in kairos and enabling the systemd-bless-boot service https://github.com/kairos-io/kairos/pull/3034
- agent changes to add boot assessment to config files on install, upgrade and reset (ongoing, install and upgrade done: https://github.com/kairos-io/kairos-agent/pull/604)
Mostly done; only the agent PR is still waiting to be merged, and then we can test it once it's in the framework and such. Testing it locally, it seems to work as expected.
All merged. created follow ups:
https://github.com/kairos-io/kairos/issues/3041 https://github.com/kairos-io/kairos/issues/3040