bootupd
bootupd copied to clipboard
efi: update the ESP by creating a tmpdir and RENAME_EXCHANGE
See Timothée's comment https://github.com/coreos/bootupd/issues/454#issuecomment-2178227050
Reuse TMP_PREFIX, logic is like this:
cp -a fedora .btmp.fedora- We start with a copy to make sure to keep all other files that we do not explicitly track in bootupd
- Update the content of
.btmp.fedorawith the new binaries - Exchange
.btmp.fedora->fedora - Remove now "old"
.btmp.fedora
If we have a file not in a directory in EFI, then we can copy
it to .btmp.foo and then act on it and finally rename it. No
need to copy the entire EFI.
Additional info for the logic:
- for removals, copy to temp dir, remove file in temp dir; if have a file not in a directory, remove it directly
- for changes/additions, copy to temp dir, write files in temp dir; if have a file not in a directory, copy as temp file
- Do local exchange/rename and remove temp dir/file
And use insert() instead of push() to match starts_with()
when scanning temp files & dirs.
Fixes https://github.com/coreos/bootupd/issues/454
filetree: add failpoint when doing exchange Inspired by https://github.com/coreos/bootupd/pull/669#issuecomment-2220760948
[2024-06-20T08:07:06.180Z] # Previous BIOS: grub2-tools-1:2.06-89.fc38.x86_64
[2024-06-20T08:07:06.180Z] # Updated BIOS: grub2-tools-1:2.06-102.fc38.x86_64
[2024-06-20T08:07:06.180Z] # error: Failed to update EFI: No such file or directory (os error 2)
[2024-06-20T08:07:06.180Z] + fatal 'test failed'
[2024-06-20T08:07:06.180Z] + echo error: test failed
[2024-06-20T08:07:06.180Z] error: test failed
🤔
[2024-06-20T12:39:28.464Z] # Previous BIOS: grub2-tools-1:2.06-89.fc38.x86_64
[2024-06-20T12:39:28.464Z] # Updated BIOS: grub2-tools-1:2.06-102.fc38.x86_64
[2024-06-20T12:39:28.464Z] # Previous EFI: grub2-efi-x64-1:2.06-89.fc38.x86_64,shim-x64-15.8-3.x86_64
[2024-06-20T12:39:28.464Z] # Updated EFI: grub2-efi-x64-1:2.06-102.fc38.x86_64,shim-x64-15.8-3.x86_64,test-bootupd-payload-1.0-1.x86_64
[2024-06-20T12:39:28.464Z] # grep: /tmp/tmp.hin3bC02Ci/EFI/fedora/test-bootupd.efi: No such file or directory
[2024-06-20T12:39:28.464Z] # ls: cannot access '/tmp/tmp.hin3bC02Ci/EFI/fedora/test-bootupd.efi': No such file or directory
We've merged the CI fixes. Can you rebase this one? Thanks
We've merged the CI fixes. Can you rebase this one? Thanks
Sure, updated, thanks!
Updating shim-x64 15.8-2 -> 15.8-3, it will also update BOOT/fbx64.efi
[root@cosa-devsh ~]# bootupctl update -vvv
Previous EFI: grub2-efi-x64-1:2.06-103.fc40.x86_64,shim-x64-15.8-2.x86_64
Updated EFI: grub2-efi-x64-1:2.06-123.fc40.x86_64,shim-x64-15.8-3.x86_64
[root@cosa-devsh ~]# mount /dev/vda2 /boot/efi/
[root@cosa-devsh ~]# ls /boot/efi/EFI/BOOT/ -l
total 1014
-rwxr-xr-x. 1 root root 949424 Jan 1 1980 BOOTX64.EFI
-rwxr-xr-x. 1 root root 87816 Jun 21 10:01 fbx64.efi
If using .fedora.tmp will cause CI failed grep: /tmp/tmp.hin3bC02Ci/EFI/fedora/test-bootupd.efi: No such file or directory, guess it is because filetree is based on "BOOT/BOOTX64.EFI" or "fedora/BOOTX64.CSV", so the apply_diff will also create the dir like BOOT and fedora under temp dir, which makes it actually /tmp/tmp.hin3bC02Ci/EFI/fedora/fedora/test-bootupd.efi.
The problem with copying the entire EFI directory is that we don't know what's there, and there might a lot of other vendor directories (windows, ubuntu, etc.) that we don't really want to copy as that might take more space than what we have in the EFI partition.
The problem with copying the entire
EFIdirectory is that we don't know what's there, and there might a lot of other vendor directories (windows,ubuntu, etc.) that we don't really want to copy as that might take more space than what we have in the EFI partition.
Makes sense, we might need to do some change to apply_diff, if we make temp for BOOT and fedora.
Update the change in apply_diff(), not sure if it is the right place, steps are:
- get unique updates dir in
diff, save them in hashset - scan the hashset,
- extract the
diffbased on the same parent - copy original dir to temp dir
- remove original file in temp dir first, then write new and changed files to temp dir
- do local exchange for
fedora.tmptofedora - remove
fedora.tmp
- extract the
That looks ok but the logic feels a little bit difficult to follow. I might have missed something however so maybe my understanding is not correct.
I'm thinking about the following logic for apply_diff:
- First look at removals (to minimize disk space usage)
- Look at which "root/parent" folder the file to remove is
- Does a temporary copy of this folder exists?
- Yes, remove the file in the copy
- No:
- Copy the folder
- Remember that we made a copy of this folder
- Remove the file in the copy
- Repeat the same logic for changes and additions
- Sync changes to the disk once here
- At this point, we haven't yet touched the content actually used to boot
- If there is not enough space to write all the files, we should have failed before this point
- Iterate over the folders we made a copy of and do RENAME_EXCHANGE
- Sync changes between each rename (or once at the end?)
- Remove all the temporary folders
- Sync one last time (optional?)
I'm thinking about the following logic for
apply_diff:
First look at removals (to minimize disk space usage)
Look at which "root/parent" folder the file to remove is
Does a temporary copy of this folder exists?
Yes, remove the file in the copy
No:
- Copy the folder
- Remember that we made a copy of this folder
- Remove the file in the copy
Repeat the same logic for changes and additions
Sync changes to the disk once here
- At this point, we haven't yet touched the content actually used to boot
- If there is not enough space to write all the files, we should have failed before this point
Iterate over the folders we made a copy of and do RENAME_EXCHANGE
Copy the changed files to prepared temp dir, the final result will not be as expected if the change dir is included by another change. For example, if changes are EFI/fedora/subdir/newgrub.x64 and EFI/fedora/grub.x64, the temp dir will be EFI/fedora/subdir.tmp and EFI/fedora.tmp, then do exchange EFI/fedora/subdir.tmp -> EFI/fedora/subdir, the problem is EFI/fedora.tmp -> EFI/fedora will not include the new change in EFI/fedora/subdir, this is the unit test in test_filetree2() that I debug most of the time, so I do the exchange separately based on dir, any other workarounds for this?
It is not problem for the current bootupd(no this patch) as we only update the file, not related to the dir.
- Sync changes between each rename (or once at the end?)
- Remove all the temporary folders
- Sync one last time (optional?)
if changes are EFI/fedora/subdir/newgrub.x64 and EFI/fedora/grub.x64, the temp dir will be EFI/fedora/subdir.tmp and EFI/fedora.tmp
With EFI/fedora/subdir/newgrub.x64 changed, we should copy EFI/fedora to EFI/fedora.tmp. That should cover all the subdirs and should be granular enough.
I.e. we need to find the first subdir in EFI and copy that.
if changes are EFI/fedora/subdir/newgrub.x64 and EFI/fedora/grub.x64, the temp dir will be EFI/fedora/subdir.tmp and EFI/fedora.tmp
With
EFI/fedora/subdir/newgrub.x64changed, we should copyEFI/fedoratoEFI/fedora.tmp. That should cover all the subdirs and should be granular enough.I.e. we need to find the first subdir in
EFIand copy that.
This is complicated as we do not know which comes first, we just scan them in diff.additions, diff.changes, diff.removals.
If the parent already in temp dir, then no need to create sub temp dir; if no, will copy to temp dir; if temp sub dir exists, then need to rename the changed temp sub dir first, to keep temp dir to updated.
This will make the update slow as we do a lot of checking, WDYT?
I'm not sure I understand what you mean by "which comes first". I think the idea is more to think of it as the first subdir in the EFI folder instead of thinking about it as "parent" directory.
If I have the following update/addition/removal for example:
EFI/fedora/subdir/fooEFI/fedora/subdir/bar/totoEFI/fedora/baz
Then the logic would be:
- For the file
EFI/fedora/subdir/fooas part of the diff:- Look if
fedora.tmpexists (the first subdir ofEFI) - Copy
fedoratofedora.tmpand remember it - Update/remove/add
EFI/fedora.tmp/subdir/foo
- Look if
- For the file
EFI/fedora/subdir/bar/totoas part of the diff:- Look if
fedora.tmpexists - I does, so nothing to copy
- Update/remove/add
EFI/fedora/subdir/bar/toto
- Look if
- Repeat for
EFI/fedora/baz - Once done for all files in the diff, do the rename for the folders that we remembered we copied:
- Rename exchange
fedora.tmp->fedora, etc.
- Rename exchange
I'm not sure I understand what you mean by "which comes first"
Sorry for the confusion,
if the changes are EFI/fedora/subdir/newgrub.x64, EFI/fedora/grub.x64,
copy to EFI/fedora/subdir.tmp first, then comes EFI/fedora.tmp, need to remove EFI/fedora/subdir.tmp first, just copy to EFI/fedora.tmp.
if the changes are EFI/fedora/grub.x64, EFI/fedora/subdir/newgrub.x64,
copy to EFI/fedora.tmp first, then comes EFI/fedora/subdir.tmp, check parent EFI/fedora.tmp exists, just using EFI/fedora.tmp.
For the file EFI/fedora/subdir/foo as part of the diff: Look if fedora.tmp exists (the first subdir of EFI)
If the change is like EFI/foo, the temp dir should be EFI.tmp.
If I have the following update/addition/removal for example:
EFI/fedora/subdir/foo EFI/fedora/subdir/bar/toto EFI/fedora/baz
Maybe should scan update/addition/removal in diff to get parent dir, like {EFI/fedora/subdir, EFI/fedora/subdir/bar, EFI/fedora}, then check them, and get only one EFI/fedora, then do copy to temp EFI/fedora.tmp?
Thanks for working on this! It's a really important feature. I hope to get some time by the end of this week or early next week to look at this.
Sorry for the confusion, if the changes are
EFI/fedora/subdir/newgrub.x64, EFI/fedora/grub.x64, copy toEFI/fedora/subdir.tmpfirst, then comesEFI/fedora.tmp, need to removeEFI/fedora/subdir.tmpfirst, just copy toEFI/fedora.tmp.
If we have EFI/fedora/subdir/newgrub.x64, EFI/fedora/grub.x64, we only need to copy at the fedora level, not at any other sub-levels.
For the file EFI/fedora/subdir/foo as part of the diff: Look if fedora.tmp exists (the first subdir of EFI)
If the change is like
EFI/foo, the temp dir should beEFI.tmp.
If we have a file not in a directory in EFI, then we can copy it to foo.tmp and then act on it and finally rename it. No need to copy the entire EFI.
The goal is to make the operation atomic at the vendor (i.e. fedora, BOOT, etc.) directory level, not at the EFI or any other sub-directory level.
(I've not looked at the code again yet)
make a test to verify the case where the update fails in the middle, right between the two rename exchange calls.
We need to make sure that we can properly resume an update if it fails in the middle, when half has already been exchanged and the other half still needs to be exchanged (BOOT.tmp -> BOOT exchange already done, fedora.tmp -> fedora exchange still pending). It would be incredibly unlikely to happen in practice but this is the point of this PR thus it would be great to have that.
That might need a special "fail" mode flag for testing however so might not be desirable.
make a test to verify the case where the update fails in the middle, right between the two rename exchange calls.
We need to make sure that we can properly resume an update if it fails in the middle, when half has already been exchanged and the other half still needs to be exchanged (
BOOT.tmp->BOOTexchange already done,fedora.tmp->fedoraexchange still pending)
This is likely to happen, for example, the power can be outage at anytime during update.
Not sure about the resume, if running bootupctl update again, apply_diff() will firstly call cleanup_tmp() to remove all temp dirs/files, and then do the sync. I can manually do the testing for this firstly. Not sure about the auto test.
Edit: Not easy to do manually testing, as the update only uses about 1.5s.
# time bootupctl update
Running as unit: bootupd.service
Previous BIOS: grub2-tools-1:2.06-103.fc40.x86_64
Updated BIOS: grub2-tools-1:2.06-123.fc40.x86_64
Previous EFI: grub2-efi-x64-1:2.06-103.fc40.x86_64,shim-x64-15.8-2.x86_64
Updated EFI: grub2-efi-x64-1:2.06-123.fc40.x86_64,shim-x64-15.8-3.x86_64
real 0m1.421s
user 0m0.005s
sys 0m0.010s
Edit: Not easy to do manually testing, as the update only uses about 1.5s.
We could do https://github.com/coreos/rpm-ostree/pull/4259/commits/aa8d7fb0ceaabfaf10252180e2ddee049d07aae3 in this repo too
I think my high level feelings on this are basically:
- I liked the original idea generally
- It feels 40% less useful if we can't do transactional updates for shim (to rephrase: it's still useful, just less so)
- This is complex and subtle code and problem domain and while bootupd has tests (and thanks for adding new unit tests I just remain in an uncertain feeling due to the above combined
do transactional updates for shim
Is it something like https://github.com/coreos/bootupd/issues/440?
~The error in CI is wired, can reproduce locally using quay.io/centos/centos:stream9~
enabling baseos-source repository
enabling appstream-source repository
enabling extras-common-source repository
CentOS Stream 9 - BaseOS - Source 496 B/s | 3.0 kB 00:06
Errors during downloading metadata for repository 'baseos-source':
- Downloading successful, but checksum doesn't match. Calculated: e13bd479141ed656a70be1867b0c7258625f469b13baaf70011319723361358415da1f8417f92cf37eaf5dd4bb83dddae4ae7e52fddbc4c6dfb0ac3af2c3233a(sha512) Expected: c435af47f3197ce3267e5244050b6b1ef1ee668c91cee288d83c887471d5c239b1c82cfec43c283624ba25d40b6eb01e56768a54bd7ffa4b47a5e21fb62365b3(sha512)
Error: Failed to download metadata for repo 'baseos-source': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Error: building at STEP "RUN dnf -y install dnf-utils zstd && dnf builddep -y rust-bootupd": while running runtime: exit status 1
- It feels 40% less useful if we can't do transactional updates for shim (to rephrase: it's still useful, just less so)
Well, it's transactional, but not "across" folders. I find it unlikely that shim and grub would end up needing to be updated at the same time for them to properly work. I've asked input from the bootloader team but did not get any so far.
So far, it's the best that we've got without touching the EFI variables.
- It feels 40% less useful if we can't do transactional updates for shim (to rephrase: it's still useful, just less so)
Well, it's transactional, but not "across" folders. I find it unlikely that shim and grub would end up needing to be updated at the same time for them to properly work.
So far, it's the best that we've got without touching the EFI variables.
With this patch, it is just to update files in another way (via temp dir). Before enable auto-updates, I think it should be more safer if include https://github.com/coreos/bootupd/issues/440
Anything else that I should do before merged? CC @cgwalters @travier , thanks!