
Make deployment finalization do less things

Open jlebon opened this issue 2 years ago • 9 comments

Right now, a lot of work happens at finalization time:

  • /etc merge
  • SELinux policy recompilation
  • copying large files into the bootfs
  • deleting old deployments
  • ostree repo pruning
  • updating the bootloader

The issue with this is that:

  1. many of these operations are fallible, and when one fails the system reboots into the same deployment, which is confusing and wastes time
  2. it makes reboots slower

This is a tracker for ideas on how to thin it out.

jlebon, Apr 20 '22 15:04

> copying large files into the bootfs

For this, we could preemptively copy the kernel and initramfs to the bootfs. The downside of this is that it's unnecessary I/O in setups where the update driver could be restaging deployments multiple times before actually rebooting. So we probably want to make it configurable.
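
A minimal sketch of what an opt-in preemptive copy could look like. Everything here is an assumption made for illustration: the function name, the `enabled` switch, the file names, and the size-only "already staged" check are hypothetical and are not ostree's actual layout or API.

```python
import os
import shutil
from typing import List, Sequence


def precopy_boot_files(
    src_dir: str,
    bootfs_dir: str,
    names: Sequence[str] = ("vmlinuz", "initramfs.img"),  # hypothetical file names
    enabled: bool = True,  # hypothetical config switch
) -> List[str]:
    """Copy boot files into the bootfs ahead of finalization.

    Returns the names that were actually copied, so a caller can see how
    much I/O a restage cost.
    """
    copied: List[str] = []
    if not enabled:
        return copied  # leave the copy to finalization time
    os.makedirs(bootfs_dir, exist_ok=True)
    for name in names:
        src = os.path.join(src_dir, name)
        dst = os.path.join(bootfs_dir, name)
        # Skip files that look already staged, so repeated restaging is cheap.
        # Size-only comparison is a simplification; a checksum would be safer.
        if os.path.exists(dst) and os.path.getsize(dst) == os.path.getsize(src):
            continue
        tmp = dst + ".tmp"
        shutil.copyfile(src, tmp)
        # Rename into place so a reader never sees a half-written file.
        # A real implementation would also fsync the file and directory.
        os.replace(tmp, dst)
        copied.append(name)
    return copied
```

With `enabled=False` the function does nothing, which models the setups where an update driver restages often and the extra I/O is not wanted.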

jlebon, Apr 20 '22 15:04

> • deleting old deployments
> • ostree repo pruning

For this, see https://github.com/ostreedev/ostree/issues/2510.

jlebon, Apr 20 '22 15:04

> • deleting old deployments

Regarding this: with #1419 you don't end up with old deployments; the old ones are updated with the new content. This is faster in two ways: the checkout is fast, and the old deployment doesn't need to be deleted.

wmanley, Apr 20 '22 16:04

> • /etc merge
> • SELinux policy recompilation

Random half-baked idea for this:

  • add an ExecStart to ostree-finalize-staged.service which installs inotify watchers for /etc and e.g. bumps a stamp file in /run or bumps the mtime of /etc itself whenever a modification happens
  • do /etc merge and SELinux policy recompilation at staging time
  • at finalization time:
    • if /etc hasn't been changed since staging, skip /etc merge and SELinux policy recompilation
    • if /etc changed, redo /etc merge and SELinux policy recompilation. Retain the recompiled SELinux policy before rerunning semodule: if the /etc change doesn't affect SELinux, we'll save on recompilation
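
The finalization-time decision above can be sketched as follows. This is only an illustration under stated assumptions: the two stamp files, the function names, and the step labels are hypothetical, and the inotify watcher that would bump the "modified" stamp is not shown.

```python
import os
from typing import List


def etc_changed_since_staging(staged_stamp: str, modified_stamp: str) -> bool:
    """Return True if /etc may have changed after the staging-time merge.

    staged_stamp:   hypothetical file touched when the staging-time merge finished
    modified_stamp: hypothetical file the watcher bumps on every /etc change
    """
    try:
        staged_ns = os.stat(staged_stamp).st_mtime_ns
    except FileNotFoundError:
        return True  # no staging-time merge recorded, so be conservative
    try:
        modified_ns = os.stat(modified_stamp).st_mtime_ns
    except FileNotFoundError:
        return False  # the watcher never saw a modification
    return modified_ns > staged_ns


def plan_finalization(staged_stamp: str, modified_stamp: str) -> List[str]:
    """List the steps finalization still has to run (labels are illustrative)."""
    steps: List[str] = []
    if etc_changed_since_staging(staged_stamp, modified_stamp):
        steps += ["etc-merge", "selinux-recompile"]
    steps.append("update-bootloader")
    return steps
```

The sketch errs on the side of redoing the merge whenever the staging stamp is missing. It does not model retaining the recompiled SELinux policy between runs.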

jlebon, Apr 20 '22 16:04

We just got another bug https://bugzilla.redhat.com/show_bug.cgi?id=2075126 that is because something is mutating /etc at the same time ostree-finalize-staged is running. I think the direction we really need to go instead is moving this into the initramfs at shutdown time (or, I guess, in theory bootup time would work too).

There's a core tension here between "prepare while system is online and running" and "avoid race conditions".

cgwalters, Apr 25 '22 20:04

Finalization doesn't seem to be completely fail-safe when it is interrupted. I hit an issue where grub.cfg ended up empty.

cheese, Jul 22 '22 08:07

> Finalization doesn't seem to be completely fail-safe when it is interrupted. I hit an issue where grub.cfg ended up empty.

Please file a separate issue for this.

jlebon, Jul 22 '22 15:07

Hmm, or maybe simplest is to just do the /etc merge from the initramfs when booting into the new deployment (I think you may have suggested that at some point).

There are optimizations we could do on that, like still doing a preliminary /etc merge at deployment time and storing a dirhash somewhere, so we know whether we have to do it again.
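
A rough sketch of what such a dirhash could look like, assuming a SHA-256 over paths, basic metadata, symlink targets, and file contents. The function name and the choice of fields are assumptions made for illustration. Notably, it ignores xattrs such as SELinux labels, which a real implementation would probably need to cover.

```python
import hashlib
import os
import stat


def dirhash(root: str) -> str:
    """Return a hex digest summarizing the tree under root."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            h.update(os.fsencode(os.path.relpath(path, root)) + b"\0")
            meta = f"{st.st_mode:o}:{st.st_uid}:{st.st_gid}"
            h.update(meta.encode("ascii") + b"\0")
            if stat.S_ISLNK(st.st_mode):
                h.update(os.fsencode(os.readlink(path)) + b"\0")
            elif stat.S_ISREG(st.st_mode):
                # Prefix the content with its size to keep entries unambiguous.
                h.update(str(st.st_size).encode("ascii") + b"\0")
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(65536), b""):
                        h.update(chunk)
    return h.hexdigest()
```

The idea would be to store `dirhash("/etc")` after the preliminary merge and compare it against a fresh hash later; equal digests would mean the merge can be skipped. Hashing reads every file, so this trades a full read of /etc for the cost of redoing the merge.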

jlebon, Aug 24 '22 14:08

> We just got another bug bugzilla.redhat.com/show_bug.cgi?id=2075126 that is because something is mutating /etc at the same time ostree-finalize-staged is running

Hmm weird, https://github.com/openshift/machine-config-operator/pull/2414 should've fixed this on OCP for containerized workloads.

jlebon, Aug 24 '22 14:08