mkosi v15+: mkosi.extra/boot/ files missing in /boot, breaks incremental update_existing_rootfs()
update_existing_rootfs() currently relies on /boot/System.map-N.M being located on the main partition. When it's not, the "incremental" build fails like this:
not found: ./qbuild/mnt/boot/System.map-6.12.0. Try rebuilding with '-r img'
The -r img workaround is correct but obviously much slower.
Note there are multiple places where the ESP partition can be mounted: notably /efi or /boot. Fedora+mkosi seems to always use /efi by default?
https://wiki.archlinux.org/title/EFI_system_partition#Typical_mount_points
cc:
- #75
I can reproduce as early as mkosi v15. This was likely caused by the v15 switch to systemd-repart, see giant commit
https://github.com/systemd/mkosi/commit/8bbbd836078a2 "Migrate disk image building to systemd-repart"
Because we don't know up-front anymore where the ESP partition will be mounted, all boot loader files are installed to /boot. So to populate an ESP partition, you'd use "CopyFiles=/boot:/" in the partition definition file of the ESP partition.
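For illustration, here is a minimal sketch of what such partition definitions could look like, written out from shell. The function name `write_repart_defs`, the file names and the 512M size are assumptions made up for this example. The `[Partition]` keys are standard repart.d settings, but this layout has not been verified against any particular mkosi version. Whether /boot also ends up on the root partition depends on the rest of the configuration.

```shell
# Sketch only: write two minimal systemd-repart partition definitions.
# write_repart_defs, the file names and the sizes are hypothetical.
write_repart_defs()
{
	local dir="$1"
	mkdir -p "$dir" || return 1

	# ESP: populated from the /boot staging area of the image tree
	cat > "$dir/00-esp.conf" <<'EOF'
[Partition]
Type=esp
Format=vfat
CopyFiles=/boot:/
SizeMinBytes=512M
EOF

	# Root partition: the whole image tree
	cat > "$dir/10-root.conf" <<'EOF'
[Partition]
Type=root
Format=ext4
CopyFiles=/
EOF
}
```

Something like `write_repart_defs mkosi.repart` would then create the two files in a directory next to the mkosi configuration (the directory name is also an assumption here).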
> I can reproduce as early as mkosi v15.
Correction: with mkosi v22, /boot/System.map-6.12.0 and friends land in the ESP partition.
With mkosi v15, they land NOWHERE!
~I think we just need a systemd-repart configuration~. It felt great to avoid an explicit partition table and just rely entirely on mkosi defaults but that's just too "volatile" and unpredictable for something like update_existing_rootfs(). Even if update_existing_rootfs() could get smarter and dynamically adjust its System.map logic now to various partition schemes, it would break again somewhere else or for some other, random mkosi version. So let's just bite the systemd-repart configuration bullet. I took a look and it does not look like rocket science. Also, it's still possible to leave a lot of things as default in such a configuration.
EDIT cc:
- https://github.com/systemd/mkosi/issues/3948
> I think we just need a systemd-repart configuration.
... or maybe not. Maybe that's not required after all... Change of mind.
One burning question is: what is the -F System.map argument trying to achieve? It came with the addition of the depmod invocation in commit 2ed0ed3af4fa2f3ec. man depmod says:
-F, --filesyms System.map
Supplied with the System.map produced when the kernel was built, this allows the -e
option to report unresolved symbols. This option is mutually incompatible with -E.
But -e is not currently used! So, -F does nothing at all?
Also: when invoked by update_existing_rootfs(), setup_depmod() seems to look at the OLD System.map file? This reinforces the suspicion that it does nothing :-D
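To make the -e / -F relationship concrete, here is a small sketch that only prints the depmod command line instead of running it. `depmod_cmdline` is a hypothetical helper, not something in run_qemu.sh. It adds -F only together with -e, because according to the man page excerpt above -F has no visible effect on its own.

```shell
# Sketch only: build (and print, not run) a depmod command line.
# depmod_cmdline is a hypothetical helper for illustration.
depmod_cmdline()
{
	local prefix="$1" kver="$2" system_map="$3"
	local cmd="depmod -b $prefix"

	if [ -f "$system_map" ]; then
		# -e reports unresolved symbols and needs -F (or -E) to do so
		cmd="$cmd -e -F $system_map"
	fi
	echo "$cmd $kver"
}
```

When the System.map file is missing, the printed command simply omits both options, which matches what the current code effectively does.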
Could this -F be another instance of trying to port to mkosi v15+ another update_existing_rootfs() feature that never actually worked with v14- in the first place? Like #76. If yes then let's just (temporarily) delete it to unblock the migration to v15+
Generally speaking, porting to mkosi v15+ is really hard without a clear picture of: 1) what the code was supposed to do with mkosi v14- in the first place, and 2) what it was actually achieving with v14-.
Other complications:
The kernel and the initrd live in potentially 3 different places. Even with a fresh build from scratch, all these have a different initrd file :-(
Status with mkosi v14- and Fedora 40 (v15+ has significant differences)
- mkosi.extra/usr/lib/modules/6.12.0-dirty/vmlinuz # used when booting with --direct-kernel = the default option
- mkosi.extra/boot/vmlinuz-6.12.0-dirty # yet another duplicate, yeah! Staging for /boot/
- ESP partition # usually mounted at /efi, used when booting with --no-direct-kernel
  - 5248fff44e974fce9cc88b89875eb063/6.12.0/linux # usual bzImage. This copy is NOT updated by the update_existing_rootfs() shortcut. Gone or moved with v15+
  - EFI/Linux/linux.efi # copy of the above. systemd-boot default. NOT actually a UKI! Not even an .EFI binary! This generates a bootctl warning. Created and updated by update_rootfs_boot_kernel(): still there with v15+ (with a slightly different name) and still the systemd-boot default. Fixes and renames submitted in #98
  - EFI/Linux/mkosi-fedora-6.12.efi # all-in-one UKI with initrd included. Unreliable with mkosi v14? Can be just ignored.
- /boot on the root partition: the usual vmlinuz+initrd with v14- thanks to install_build_initrd() / make_install_kernel(); EMPTY with mkosi v15+!! Never used at boot time, only at later modprobe time? vmlinuz does get updated by update_existing_rootfs()
The situation with modules is similar but even more varied because in addition to being embedded in initrd files, modules are also in /lib/modules/. Business as usual.
Simply dropping the -F System.map argument is enough to build and boot with mkosi v15 (EDIT: and with many other mkosi versions) https://github.com/pmem/run_qemu/actions/runs/12402741008/job/34624880942?pr=90
@stellarhopper , @weiny2 could you test that -F System.map drop more extensively? I mean with some actual kernel and module changes...
--- a/run_qemu.sh
+++ b/run_qemu.sh
@@ -1037,11 +1037,11 @@ setup_depmod()
fi
if [ ! -f "$system_map" ]; then
echo "not found: $system_map. Try rebuilding with '-r img'"
- return 1
+ # return 1
fi
: Warning: symlinks created by this depmod dont survive the move
: to the virtual machine
- sudo depmod -b "$prefix" -F "$system_map" -C "$depmod_dir" "$kver"
+ sudo depmod -b "$prefix" -C "$depmod_dir" "$kver"
}
I did a lot more testing and dropping "-F System.map" is not good enough. It's just shooting the messenger. It's a "also guilty" messenger but still just a messenger. Dropping "-F System.map" fixes the build but hides a bigger missing /boot problem.
Here's the situation with mkosi v15+ if we drop "-F System.map"
- run_qemu.sh from scratch; invokes mkosi: /boot/ is totally empty
- run_qemu.sh not from scratch: mkosi not used, update_init_rootfs() run instead: /boot/ has the latest vmlinuz
The above tested with both v15 and v23.
I think it's better to fail with this "system.map" error message because it can lead people to this bug and issue until the real /boot/ problem is actually fixed rather than silently give them an empty and then mostly empty /boot/ while pretending everything looks fine.
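As a sketch of what an explicit check could look like, instead of relying on the System.map error as an indirect signal: `boot_has_kernel` is a hypothetical helper, and the vmlinuz-$kver file name is an assumption based on the mkosi.extra/boot/ staging name mentioned earlier.

```shell
# Sketch only: fail loudly when /boot inside the mounted image
# has no kernel for the given version.
# boot_has_kernel is a hypothetical helper for illustration.
boot_has_kernel()
{
	local prefix="$1" kver="$2"

	if [ -f "$prefix/boot/vmlinuz-$kver" ]; then
		return 0
	fi
	echo "empty or stale /boot: $prefix/boot/vmlinuz-$kver not found" >&2
	return 1
}
```

A caller could run something like `boot_has_kernel ./qbuild/mnt "$kver" || return 1` right after mounting the image, so the failure points at /boot itself.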
Hm, didn't mean to close this - I guess it auto-closed because of the mention in #98
> I guess it auto-closed because of the mention in https://github.com/pmem/run_qemu/pull/98
Most likely yes, please upvote https://github.com/orgs/community/discussions/17308 (and duplicates...)
> Dropping "-F System.map" fixes the build but hides a bigger missing /boot problem.

So the key question is: does anyone or anything use /boot?
/boot inside the image is used by neither --direct-kernel nor by --no-direct-kernel right now. The former uses the kernel and initrd outside the image. The latter uses the /efi partition.
Maybe /boot/ was used in older, GRUB times but not anymore now? @stellarhopper, @weiny2 , any memories?
If /boot is not used or not used anymore, then we can drop /boot entirely, point -F System.map somewhere else and the problem should be solved!
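A rough sketch of "somewhere else": try the kernel build tree first and fall back to the locations inside the image. `find_system_map` and its candidate list are assumptions for illustration, not existing run_qemu.sh code. The only certain entry is that a kernel build leaves System.map at the top of the build directory.

```shell
# Sketch only: print the first System.map found among a few candidates.
# find_system_map and the candidate order are hypothetical.
find_system_map()
{
	local prefix="$1" kver="$2" builddir="$3" candidate

	for candidate in \
		"$builddir/System.map" \
		"$prefix/boot/System.map-$kver" \
		"$prefix/efi/System.map-$kver"; do
		if [ -f "$candidate" ]; then
			echo "$candidate"
			return 0
		fi
	done
	return 1
}
```

With the build tree listed first, the result no longer depends on which partition mkosi decides to put /boot files in.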
@marc-hb yeah I'm pretty sure this is true - /boot is just a holdover from grub days, and likely can be removed now.
> update_existing_rootfs() currently relies on /boot/System.map-N.M being located on the main partition. When it's not, the "incremental" build fails like this: ... I can reproduce as early as mkosi v15.
I don't understand how this stopped being an issue. Was it Fedora 40 specific? I'm not using Fedora much these days.
EDIT: right now -r img has an empty /boot which is not an issue, while -r img has a non-empty /boot/ which works too. I can't remember when this was failing and why.
That does not mean /boot/ is useful now. Maybe it still isn't. But the build does not fail...