WIP: Add support for `--replace-mode=alongside` for ostree target
Ironically, our support for `--replace-mode=alongside` breaks when we're targeting an already extant ostree host: when we blow away the /boot directory, the ostree stack loses its knowledge that we're in a booted deployment and will attempt to GC it...
https://github.com/ostreedev/ostree-rs-ext/pull/550/commits/8fa019bfa821303cfb7a7f069ae2320f4c3800fa is a key part of the fix for that.
However, a notable improvement we could make here is to grow this whole thing into a real "factory reset" mode, which would be a compelling answer to https://github.com/coreos/fedora-coreos-tracker/issues/399
To implement that, though, we need to support configuring the stateroot rather than hardcoding `default`.
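As a sketch, that could be as small as a new option on the install options struct (assuming the clap-based CLI; the exact name and placement here are illustrative, not the final interface):

```rust
use clap::Parser;

/// Illustrative subset of the install options; only `stateroot` is new,
/// everything else is elided.
#[derive(Debug, Parser)]
pub struct InstallOpts {
    /// The ostree stateroot (osname) to install into, instead of
    /// always using "default".
    #[clap(long, default_value = "default")]
    pub stateroot: String,
}
```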
Sorry @cgwalters, I accidentally pushed the rebase to your fork instead of mine
~EDIT: undid it, continuing my rebase efforts on https://github.com/omertuc/bootc/tree/137clone~
EDIT2: Continuing here
I think you can just take over this PR too if you want, or open a new PR from your fork - either way.
Rebased. Without any changes, I'm hitting an issue: on an ostree system, the mounted / on the host is an overlay (-v mounted into /:/target), so the findmnt source for it is overlay rather than a /dev/... device, which trips up lsblk later on.
I'll see how I can tweak it so that it finds the right device.
With the additional mounts -v /sysroot:/target -v /sysroot:/target/sysroot instead of -v /:/target, plus --stateroot foo, this seems to work.
@cgwalters thoughts on the above mounts? Do we want to require them for install on ostree targets, or should I figure out a way to make this work without them, using just the already-documented install mounts (i.e. /:/target)?
and so the findmnt source for it is overlay rather than a /dev/... and so it trips up lsblk later on
We should learn how to peel that. This is really the same thing as https://bugzilla.redhat.com/show_bug.cgi?id=2308594 and https://github.com/ostreedev/ostree/issues/3198 and https://github.com/containers/composefs/issues/280
Short term, the simplest approach is the same logic as the grub patch: detect overlayfs for /, check whether /sysroot exists and is mounted, and if so use that.
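Roughly like this, as a sketch (the function is hypothetical, and a real version should verify /sysroot is actually a mountpoint rather than just a directory):

```rust
use std::path::{Path, PathBuf};

use anyhow::{bail, Result};

/// If the target root is an overlay (as on a booted ostree host),
/// peel it by falling back to its /sysroot, which should be the real
/// block-device-backed mount.
fn peel_overlay(target: &Path, fstype: &str) -> Result<PathBuf> {
    if fstype != "overlay" {
        return Ok(target.to_owned());
    }
    let sysroot = target.join("sysroot");
    if sysroot.is_dir() {
        Ok(sysroot)
    } else {
        bail!("target is an overlay but {} does not exist", sysroot.display())
    }
}
```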
OK. Changed it so that when the target rootfs is an overlay, we'll implicitly try targeting <original_target>/sysroot instead.
It wasn't working at first, and it was a bit of a headache to debug: apparently if you mount /:/target, then inside the container /target/sysroot is read-only by default, so ensure_dir_labeled was failing; whereas when you mount /sysroot:/target directly, it's not read-only. It took me a while to track that down while chasing red herrings, and I'm still not sure who's responsible for this behavior (kernel? podman?), but once I realized it I simply moved ensure_dir_labeled to run only after your added let _ = crate::utils::open_dir_remount_rw..., and then the rest just worked.
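In other words, the ordering is now roughly the following (the helpers here are stand-ins so the sketch is self-contained; only the order is the point):

```rust
use std::path::Path;

use anyhow::{Context, Result};

fn prepare_target(root: &Path) -> Result<()> {
    // 1. Remount the target read-write first (stand-in for the
    //    crate::utils::open_dir_remount_rw call added in this PR);
    //    with `-v /:/target`, /target/sysroot is otherwise read-only.
    remount_rw(root).context("remounting target rw")?;
    // 2. Only then do the SELinux labeling (stand-in for
    //    ensure_dir_labeled), which performs the first writes.
    label_dir(root).context("labeling target")?;
    Ok(())
}

fn remount_rw(_root: &Path) -> Result<()> {
    Ok(()) // stand-in
}

fn label_dir(_root: &Path) -> Result<()> {
    Ok(()) // stand-in
}
```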
The current code might need a bit of touch-up, but do you think the direction of the code in its current state is good? Should I clean it up and undraft?
It wasn't working at first, and it was a bit of a headache to debug: apparently if you mount /:/target, then inside the container /target/sysroot is read-only by default, so ensure_dir_labeled was failing; whereas when you mount /sysroot:/target directly, it's not read-only.
I think that's possibly because it's bootc that's special-casing mounting /sysroot read-write - that's how we do it outside of a container, at least.
The current experience is:
echo "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOTVytyhnSfX20smAsNKYG5Zpz6vSzDZu22S8PCDJ2Iw omer" > authkeys
sudo podman run --rm --privileged -v $PWD/authkeys:/authkeys -v /dev:/dev -v /var/lib/containers:/var/lib/containers -v /:/target --pid=host --security-opt label=type:unconfined_t -e RUST_LOG=trace quay.io/otuchfel/bootc:latest bootc install to-existing-root --replace alongside --acknowledge-destructive --stateroot foobar --root-ssh-authorized-keys /authkeys
Following that, when I reset, it immediately boots into the new foobar stateroot; there are no grub boot entries for the original one.
I assume this is not our desired experience, and I should look into having the original boot entry preserved?
It's a great question. I think for install --replace=alongside, that is indeed the expectation by default.
Let's then target merging this as is?
I think then what we should look at is weaving this functionality into https://github.com/containers/bootc/issues/404, right?
That said, we may end up wanting to expose this as something like bootc install --retain? At least in the ostree case it's trivial to do; we just don't blow away /boot :smile:
The super messy thing is interoperating with non-ostree-ready bootloader setups so --retain would probably have to just error out in that case for now.
IOW:
- Let's merge as is
- Look at --retain as a followup?
sgtm
OK, so we should have CI coverage here... I think it's actually as easy as doing another install after an initial one, without the manual "wipe ostree" stuff we have in one of the install tests, right?
Not sure I agree; ideally we should actually boot into the installed ostree system first? But getting reboots to work with the current tests-integration/src/install.rs harness will be tricky.
But anyway, it can't hurt to have a test like you suggested, even without a reboot.
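Shape-wise I'm imagining something like this (sketch only; the function is made up and the real test would go through the existing tests-integration harness rather than raw Command):

```rust
use std::process::Command;

use anyhow::{ensure, Result};

/// Run `install to-existing-root` twice, the second time on top of the
/// ostree layout left behind by the first, with no manual "wipe ostree"
/// step in between.
fn reinstall_over_ostree() -> Result<()> {
    for stateroot in ["default", "second"] {
        let status = Command::new("bootc")
            .args(["install", "to-existing-root", "--acknowledge-destructive"])
            .args(["--stateroot", stateroot])
            .status()?;
        ensure!(status.success(), "install into stateroot {stateroot} failed");
    }
    Ok(())
}
```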
Yes, testing with reboots is going to require some more infrastructure here.
~I've added a test but as suspected, the test passes even on the main branch, so it's not very helpful for verifying this PR without a proper reboot. Should we keep it anyway?~
~scratch that I'm silly and forgot to remove the call to reset_root~
Unscratch that: even after removing the call to reset_root, the test still passes on the main branch :face_with_diagonal_mouth:
We talked about this and realized that while the new test passes, it would already pass today, because it's actually `install --stateroot` that makes it work.
The main fix we need here is preserving existing deployments when we detect we're booted via ostree.
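Conceptually that guard is tiny - a sketch, assuming we detect this via the extant ostree repo on the target (the function name is illustrative):

```rust
use std::path::Path;

/// If the target already carries an ostree repo, we're (re)installing
/// over an ostree host and should preserve the existing deployments
/// and boot entries instead of wiping /boot.
fn target_has_ostree(target_root: &Path) -> bool {
    target_root.join("ostree/repo").is_dir()
}
```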
So actually a way we could test this is via our tmt tests instead.
But in the end, again, I'm good to merge as is. Since I wrote this PR I can't approve it; you (or someone else) needs to do so.
Branching this to a separate issue #847
Up until now I've tested the PR on a random RHEL AI (bootc) system I had lying around, and it worked, but now I'm trying it on Silverblue and it fails. This seems to be because / (/target inside the bootc container) on Silverblue, despite being a btrfs rw mount, cannot be written to, so our rw remount is futile.
Specifically, the call that fails is fchmod, but only because it's the first write operation we perform on the target.
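For reference, the failure boils down to roughly this (a repro sketch using the rustix crate; the path and mode are illustrative):

```rust
use std::fs::File;

use anyhow::{Context, Result};
use rustix::fs::{fchmod, Mode};

/// Open the target directory and fchmod it - the first write-ish
/// syscall we issue. On the Silverblue target this fails with EPERM
/// even though findmnt reports the mount as rw.
fn repro(path: &str) -> Result<()> {
    let dir = File::open(path).with_context(|| format!("opening {path}"))?;
    fchmod(&dir, Mode::from_raw_mode(0o755)).context("fchmod")?;
    Ok(())
}
```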
silverblue, despite being a btrfs rw mount, cannot be written to
What error do we get? -EROFS? What does findmnt look like when our container is executed? I wonder if we're messing up a remount?
```
Reusing extant ostree layout
DEBUG Loaded SELinux policy: b06a119416deb696a80dbb67248b66d3829780c9e72d376a5089097ace6de5f8
DEBUG Target . is a mountpoint, remounting rw
DEBUG Labeling .
TRACE Label for . is Some("system_u:object_r:root_t:s0")
ERROR Installing to filesystem: Creating ostree deployment: fchmod: Operation not permitted (os error 1)
```
findmnt inside the container:
```
bash-5.1# findmnt -J -v --output=SOURCE,TARGET,MAJ:MIN,FSTYPE,OPTIONS,UUID --mountpoint /target
{
   "filesystems": [
      {
         "source": "/dev/vda3",
         "target": "/target",
         "maj:min": "0:32",
         "fstype": "btrfs",
         "options": "rw,relatime,seclabel,compress=zstd:1,discard=async,space_cache=v2,subvolid=258,subvol=/root",
         "uuid": "41f3f4ae-1242-4ef1-bcd8-fd8706f51738"
      }
   ]
}
```
ERROR Installing to filesystem: Creating ostree deployment: fchmod: Operation not permitted (os error 1)
Is there maybe an SELinux denial here? Failing on fchmod is really odd.
Also, hmm, we must be missing some .context() here... are we failing in ensure_dir_labeled()? I bet that's it... but it'd be good to know for sure.
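Something along these lines at the call site (hypothetical, just to show the shape):

```rust
use std::path::Path;

use anyhow::{Context, Result};

fn label_with_context(root: &Path) -> Result<()> {
    // Attach the path so a failure reports *what* we were labeling
    // instead of a bare "fchmod: Operation not permitted".
    ensure_dir_labeled(root).with_context(|| format!("labeling {}", root.display()))
}

// Stand-in for the real helper, so the sketch is self-contained.
fn ensure_dir_labeled(_root: &Path) -> Result<()> {
    Ok(())
}
```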
@omertuc I think we need to get more of your PRs merged, you're doing good work! This one specifically I guess I'm uncertain if we should block on the silverblue-reinstall case; I suspect it's btrfs specific.
But I may look at it today...
Yeah I was downloading Fedora CoreOS (as opposed to Silverblue) today to see if we hit similar issues, I'll report with what I find
OK, yeah, there have been a ton of changes in the main branch since this PR was last rebased; it's a conflict-fest. I took a stab at it and pushed my update to https://github.com/cgwalters/bootc/tree/install-existing-ostree2
One big conflict was around how the install phase got split up; threading has_ostree through between those parts would have been annoying.
I did https://github.com/containers/bootc/pull/872 instead which is just dead code to start, but moves the "we detected an extant repo" into the "big bag of state" we already had in RootSetup.
stop stop stop :laughing: you're working on your old stale branch; I've already solved all these conflicts a while back
Force-pushed now to resolve a small new conflict on import lines; other than that, there are no conflicts vs main.
Force-pushed again because of a duplicate import.
I suspect it's btrfs specific.
I just tried Fedora CoreOS and I can confirm the same issue happens even with xfs, so it's not a btrfs issue