Support rebooting (e.g. for updates) via kexec

Open invidian opened this issue 4 years ago • 8 comments

Current situation

In some situations, rebooting machines can take a long time, which adds up across a large fleet of machines (e.g. bare metal) and extends the "maintenance" window.

Ideal future situation

Flatcar could use kexec to reboot when applying updates, reducing the reboot time of each machine.

invidian • Feb 22 '21 13:02

The logic that alters the counter of the A/B partition, chooses which partition to boot, and generates the new kernel command-line arguments is implemented in GRUB and would have to be reimplemented in userspace. Other than that, it's a question of how reliable kexec is with certain hardware.
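As a rough illustration of what reimplementing that selection logic in userspace might involve, here is a minimal Python sketch of priority/tries-based slot selection. It assumes the ChromeOS-style GPT attribute layout (priority in bits 48-51, tries in bits 52-55, successful in bit 56) and a simplified selection rule. It is not a port of the actual GRUB module, and the names `Slot`, `decode`, and `pick_next` are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence

# ASSUMPTION: ChromeOS-style bit layout within the 64-bit GPT attribute field.
PRIORITY_SHIFT = 48    # 4 bits; higher is preferred, 0 means "never boot"
TRIES_SHIFT = 52       # 4 bits; remaining boot attempts
SUCCESSFUL_SHIFT = 56  # 1 bit; set once the slot has booted successfully


@dataclass(frozen=True)
class Slot:
    """Decoded boot state of one USR partition (hypothetical helper type)."""
    label: str
    priority: int
    tries: int
    successful: bool


def decode(label: str, attrs: int) -> Slot:
    """Extract priority, tries and successful from a raw attribute value."""
    return Slot(
        label=label,
        priority=(attrs >> PRIORITY_SHIFT) & 0xF,
        tries=(attrs >> TRIES_SHIFT) & 0xF,
        successful=bool((attrs >> SUCCESSFUL_SHIFT) & 1),
    )


def pick_next(slots: Sequence[Slot]) -> Optional[Slot]:
    """Return the slot to boot next, with its tries counter already updated.

    Simplified rule (an assumption, not the GRUB implementation): a slot is
    bootable if its priority is non-zero and it is either marked successful
    or still has tries left. The highest priority wins. An unconfirmed slot
    has one try consumed.
    """
    bootable = [s for s in slots if s.priority > 0 and (s.successful or s.tries > 0)]
    if not bootable:
        return None
    chosen = max(bootable, key=lambda s: s.priority)
    if not chosen.successful:
        chosen = replace(chosen, tries=chosen.tries - 1)
    return chosen
```

A real helper would also have to write the decremented counter back to the partition table before calling kexec, which this sketch leaves out.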

pothos • Feb 22 '21 13:02

I don't think this is a good use case of kexec in general - regular production systems' state can become very complex (think driver states, DMA I/O queues, etc.). Merely kexecing into a new kernel without going through a reboot for resetting the underlying hardware / devices is likely to introduce inconsistent drivers/hardware state.

t-lo • Mar 09 '21 15:03

Hmm, I was using kexec on Ubuntu for a few years for kernel upgrades on a bunch of bare-metal servers and I never encountered any issues related to it.

invidian • Mar 09 '21 17:03

> I don't think this is a good use case of kexec in general - regular production systems' state can become very complex (think driver states, DMA I/O queues, etc.). Merely kexecing into a new kernel without going through a reboot for resetting the underlying hardware / devices is likely to introduce inconsistent drivers/hardware state.

I'm not sure this would be an issue. In the Equinix Metal production environment, we use kexec for many customer OS installations across many hardware types with a wide array of components, firmware, and target kernels to initialize to.

andy-v-h • Mar 23 '22 22:03

So there are two use cases: one is the reboot, and the other is after installation from PXE. Would it work to kexec into GRUB?

pothos • Mar 24 '22 19:03

Actually, it may help to work on https://github.com/flatcar-linux/Flatcar/issues/624 first; then we don't have to care about GRUB and kernel parameters so much.

pothos • Mar 24 '22 19:03

Some Flatcar and kexec references spotted in the wild: https://twitter.com/joonas_fi/status/1523751306489307136

invidian • May 10 '22 14:05

Let's come up with a helper script that assembles the right kernel cmdline. The tasks are detecting first boot and OEM, evaluating/setting GPT attributes, and fetching the dmverity hash.

If desired we could even move some more logic into the initrd (e.g., first boot and OEM detection doesn't actually need to be done in GRUB and we could append the dmverity hash in a secondary initrd appended to the first cpio instead of doing our hack of overwriting some kernel bytes at an arch-specific offset as done now).
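A first cut of the cmdline-assembling helper could look roughly like the Python sketch below. It only builds strings and does not execute anything. The parameter names (`mount.usr=`, `verity.usrhash=`, `flatcar.first_boot=`, `flatcar.oem.id=`) are assumptions about what the GRUB config passes today and should be checked against the real grub.cfg. The function names `build_cmdline` and `kexec_load_argv` are hypothetical.

```python
from typing import List, Optional, Sequence


def build_cmdline(
    usr_partuuid: str,
    verity_hash: Optional[str] = None,
    first_boot: bool = False,
    oem_id: Optional[str] = None,
    extra: Sequence[str] = (),
) -> str:
    """Assemble a kernel command line from already-detected values.

    ASSUMPTION: the parameter names below mirror what GRUB sets today;
    they are placeholders until verified against grub.cfg.
    """
    args = [f"mount.usr=PARTUUID={usr_partuuid}"]
    if verity_hash:
        # dm-verity root hash of the selected USR partition
        args.append(f"verity.usrhash={verity_hash}")
    if first_boot:
        args.append("flatcar.first_boot=detected")
    if oem_id:
        args.append(f"flatcar.oem.id={oem_id}")
    # Anything else the caller wants to pass through, e.g. console settings
    args.extend(extra)
    return " ".join(args)


def kexec_load_argv(kernel: str, cmdline: str, initrd: Optional[str] = None) -> List[str]:
    """Build the argv for loading a kernel with kexec-tools (load only, no jump)."""
    argv = ["kexec", "--load", kernel, f"--command-line={cmdline}"]
    if initrd:
        argv.append(f"--initrd={initrd}")
    return argv
```

With a made-up kernel path, `kexec_load_argv("/boot/flatcar/vmlinuz-a", build_cmdline("<partuuid>", first_boot=True))` yields an argument list that could be handed to `subprocess.run`; after a successful load, `systemctl kexec` would shut down services and jump into the loaded kernel. Detecting first boot and the OEM, updating the GPT attributes, and fetching the dmverity hash are deliberately left as inputs here.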

pothos • Oct 11 '22 12:10

@pothos Hey, is anyone currently tackling this? If not, I'd be happy to take ownership of this issue if that works for everyone.

krishjainx • Jun 01 '23 23:06