Add or doc support for making `/var` transient
We should really support this. One could probably hack it up with a post-switchroot systemd unit that just mounts a tmpfs on /var, but I suspect at least some people will want to spool large data to the real filesystem rather than be limited by RAM (OK, modulo enabling file-backed swap on the real root, which we should also definitely add support for... yeah, see below).
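For reference, a minimal sketch of that hack, modeled on systemd's stock `tmp.mount`. It is untested, and how it interacts with the /var mount that ostree already sets up is an open question:

```ini
# var.mount (sketch; the unit name must match the mount point /var)
[Unit]
Description=Transient tmpfs on /var
DefaultDependencies=no
Conflicts=umount.target
Before=local-fs.target umount.target

[Mount]
What=tmpfs
Where=/var
Type=tmpfs
# tmpfs defaults to half of RAM; contents are lost on reboot
Options=mode=0755

[Install]
WantedBy=local-fs.target
```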
Anyways, it'd be easy to add `var.transient` to go alongside our existing `etc.transient` and `root.transient`.
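Roughly what I have in mind for the config, assuming it lives in ostree's `prepare-root.conf` next to the existing keys. The `[var]` section does not exist today; it is the hypothetical part:

```ini
# /usr/lib/ostree/prepare-root.conf (sketch)

# Existing options, shown only for comparison
[etc]
transient = true

[root]
transient = true

# Hypothetical, not implemented: what this issue proposes
[var]
transient = true
```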
Tangent: automatic swap file support
Swap partitions are well known and supported. However, swap files also exist, basically because dealing with partitions stinks, especially resizing.
It might be nice if we supported something like... /sysroot/autoswap or /sysroot/bootc/swap that we detect in the initramfs and if present we enable by default. Although we'd also need to expose a nice declarative way to initialize that at install time as embedded in a container, probably in the install config? Hmm though there's no reason not to support changing it "day 2". In the general case of course one can just do this manually in the initramfs, but customizing the initramfs in this way has a higher bar and is more likely to break. I could imagine this being a default part of dracut...something like a rd.swap=
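If one did this by hand from the real root instead of the initramfs, a plain systemd swap unit might be enough. A sketch, assuming the hypothetical `/sysroot/bootc/swap` path from above and a file that was already initialized with `mkswap`:

```ini
# sysroot-bootc-swap.swap (the unit name must match the escaped path in What=)
[Unit]
Description=Swap file on the physical root
# Skip the unit entirely when the file is absent
ConditionPathExists=/sysroot/bootc/swap

[Swap]
What=/sysroot/bootc/swap

[Install]
WantedBy=swap.target
```

This covers neither the install-time initialization nor an `rd.swap=`-style knob; it only shows the "enable if present" part.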
Just an idea: I believe that making /var transient would also help with integrating dm-verity. I know this is not the main target of bootc, but it would be a nice plus... The main blocker for dm-verity is the fact that /var gets remounted rw at boot by an ostree systemd unit. If we make it transient (or simply avoid the rw part), then we could theoretically:
- Provide a root disk integrity verification environment for applications like confidential containers (container running in a VM, when container dies also the VM & its data dies). For that case, mounting a tmpfs on /var is more than enough.
- Mount a LUKS partition/disk on /var to get root disk integrity plus encrypted persistent data. This would be useful for generic confidential VMs.
> container running in a VM, when container dies also the VM & its data dies
For this use case though wouldn't it make more sense to launch from say a virtiofs that's a read-only export of the data maintained on the host...and then we mount a transient overlayfs over that whole thing or so?
If I understand correctly what you mean, then it unfortunately cannot work, as far as I can tell. Spoiler alert: I might be digressing a bit from bootc.
The Confidential Containers project runs on Kata Containers, a technology that runs a container inside a VM. It has two deployment modes:
- local hypervisor (where your suggestion could theoretically be applied): as of now this basically means nested virtualization, when the host/worker node is itself a VM (which is often the case in Kubernetes)
- remote hypervisor: when nested virtualization is not available, or the host lacks the confidential hardware, it is possible to spawn a VM alongside the host (i.e. if the host is an L1 VM, the container will run in another L1 VM). This is useful in the cloud, and I suppose your suggestion cannot be applied there.
So in general, the approach you proposed can't be applied to confidential containers.