rpi-eeprom
RPi 5 bootloader doesn't load kernel with GPT partition array not at LBA 2
Describe the bug
Home Assistant OS started using Genimage for creating OS images in release 12.4. After that change, we started getting reports that freshly imaged SD cards fail to boot on Raspberry Pi 5 (https://github.com/home-assistant/operating-system/issues/3437). The change introduced by Genimage was that the first LBA field in the GPT header went from 34 (which is what sgdisk typically sets) to 2048 (which is actually more appropriate with 1 MiB partition alignment), something that looked quite harmless.
This worked perfectly fine when the image was flashed to the card from Linux; however, it failed to boot when Raspberry Pi Imager was used on Windows. It turned out that the Imager itself isn't to blame, but the Windows kernel (presumably on Windows 10+; Windows 7 doesn't show that behavior), which alters the partition table even when a drive is just plugged into the computer. The change it makes is that it moves the backup LBA to the real end of the drive, and if the first LBA isn't 34, it also moves the start of the partition array to a different block (seems to be first_LBA - 32). In the second case, the actual array of partition entries still exists in both places, at LBA 2 and at LBA 2016 (i.e. the old data from the image is not nulled).
The problem is that the Raspberry Pi bootloader can't boot if the partition entries are relocated. It doesn't seem to simply assume that the partition array is at LBA 2, because the old copy is still there and would be found, so maybe something is miscalculated or a read is attempted outside of expected boundaries? Anyway, although what Windows does is quite a nasty thing, the GPT is still perfectly valid, so I'd expect it not to cause trouble.
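For reference, here is a minimal sketch of how a reader can locate the partition entries by following the header field instead of assuming LBA 2. It is not the bootloader's code; the function name `read_gpt_entries` is hypothetical, 512-byte sectors are an assumption, and the field layout follows the standard GPT header format.

```python
import struct
import zlib

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors

# Standard 92-byte GPT header: signature, revision, header size, header CRC32,
# reserved, current LBA, backup LBA, first usable LBA, last usable LBA,
# disk GUID, partition entry LBA, entry count, entry size, entry array CRC32.
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")


def read_gpt_entries(image, sector_size=SECTOR_SIZE):
    """Return (entries_lba, raw_entry_array) for a bytes-like disk image.

    Hypothetical helper: the array location comes from the header field at
    offset 72, so a relocated array (e.g. at LBA 2016) is handled as well.
    """
    hdr = image[sector_size:sector_size + GPT_HEADER.size]
    (sig, _rev, _hdr_size, _hdr_crc, _reserved, _current, _backup,
     _first, _last, _disk_guid,
     entries_lba, count, entry_size, entries_crc) = GPT_HEADER.unpack(hdr)
    if sig != b"EFI PART":
        raise ValueError("no GPT signature at LBA 1")
    start = entries_lba * sector_size
    raw = bytes(image[start:start + count * entry_size])
    # The entry array has its own CRC32, which validates the chosen location.
    if zlib.crc32(raw) != entries_crc:
        raise ValueError("partition entry array CRC mismatch")
    return entries_lba, raw
```

With a reader like this, the stale copy at LBA 2 is never consulted; only the location named in the header is read and checked.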
Here's the diff of the part of the GPT table before and after it's modified by Windows - note the change in the last 8 bytes:
-00000210: 7e39 180f 0000 0000 0100 0000 0000 0000 ~9..............
+00000210: cf3d ab66 0000 0000 0100 0000 0000 0000 .=.f............
| crc32 | reserved|current LBA |
-00000220: ffff 3f00 0000 0000 0008 0000 0000 0000 ..?.............
+00000220: ffff ba03 0000 0000 0008 0000 0000 0000 ................
| backup LBA | first LBA |
-00000230: deff 3f00 0000 0000 a211 3e2d a149 fa44 ..?.......>-.I.D
+00000230: deff ba03 0000 0000 a211 3e2d a149 fa44 ..........>-.I.D
| last LBA | GUID 0-7B |
-00000240: a850 55b6 af53 4f62 0200 0000 0000 0000 .PU..SOb........
+00000240: a850 55b6 af53 4f62 e007 0000 0000 0000 .PU..SOb........
| GUID 8-15B | partition array |
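The fields in the dump are little-endian 64-bit integers. The following sketch decodes the values shown above; the helper name `le_u64` is made up for illustration, and the 128 entries of 128 bytes each are an assumption (the usual GPT sizes), since those header fields are not part of the dump.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def le_u64(xxd_words):
    """Decode xxd-style hex words as a little-endian 64-bit integer."""
    return int.from_bytes(bytes.fromhex(xxd_words.replace(" ", "")), "little")


first_lba = le_u64("0008 0000 0000 0000")      # 2048
entries_lba = le_u64("e007 0000 0000 0000")    # 2016, after Windows touched it
backup_before = le_u64("ffff 3f00 0000 0000")  # 4194303, last sector of the image
backup_after = le_u64("ffff ba03 0000 0000")   # 62586879, last sector of the card

# Assumed array size: 128 entries * 128 bytes = 16384 bytes = 32 sectors,
# which would explain the observed "first_LBA - 32" placement.
array_sectors = 128 * 128 // SECTOR_SIZE

print(entries_lba == first_lba - array_sectors)  # True
print(hex(entries_lba * SECTOR_SIZE))            # 0xfc000
```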
I can confirm that the partition entries start at LBA 2016 (0x7e0 = byte offset 0xfc000) after that change:
-000fc000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
+000fc000: 2873 2ac1 1ff8 d211 ba4b 00a0 c93e c93b (s*......K...>.;
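As a sanity check that these 16 bytes really are the start of a partition entry, they can be decoded as a GUID in the mixed-endian on-disk format. This is only an illustrative sketch using Python's standard `uuid` module:

```python
import uuid

# First 16 bytes at offset 0xfc000 after the change (partition type GUID).
raw = bytes.fromhex("28732ac11ff8d211ba4b00a0c93ec93b")

# GPT stores the first three GUID fields little-endian; bytes_le handles that.
type_guid = uuid.UUID(bytes_le=raw)
print(type_guid)  # c12a7328-f81f-11d2-ba4b-00a0c93ec93b
```

That value is the well-known EFI System Partition type GUID, so the first entry of the relocated array describes the boot partition.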
For completeness, this is what Windows does with the partition table if the first LBA is set to 34:
-00000210: 99f9 2492 0000 0000 0100 0000 0000 0000 ..$.............
+00000210: deb0 923f 0000 0000 0100 0000 0000 0000 ...?............
| crc32 | reserved|current LBA |
-00000220: ffff 3f00 0000 0000 2200 0000 0000 0000 ..?.....".......
+00000220: ffff ba03 0000 0000 2200 0000 0000 0000 ........".......
| backup LBA | first LBA |
-00000230: deff 3f00 0000 0000 faec 4e7f 7d9a 0f4b ..?.......N.}..K
+00000230: deff ba03 0000 0000 faec 4e7f 7d9a 0f4b ..........N.}..K
| last LBA | GUID 0-7B |
00000240: 8bbc 2d37 8b85 a2f0 0200 0000 0000 0000 ..-7............
| GUID 8-15B | partition array |
Steps to reproduce the behaviour
Write the Home Assistant OS 12.4 image to an SD card (e.g. using Raspberry Pi Imager) on Windows.
- alternatively -
Boot a system image whose GUID partition table has its array of partition entries starting at an LBA other than 2.
Device (s)
Raspberry Pi 5
Bootloader configuration.
Bootloader versions up to latest 6fe0b091 2024/06/05.
System
No response
Bootloader logs
USB boot
No response
NVMe boot
No response
Network (TFTP boot)
No response