bootcode.bin: Initrd loading seems to be broken

Open flipreverse opened this issue 2 years ago • 24 comments

Describe the bug

bootcode.bin is responsible for loading the kernel image as well as the initrd. This, however, does not happen consistently: if the boot UART is enabled, it works; otherwise it does not.

To reproduce

  • Create an initrd for my overlayfs setup: mkinitramfs -o /boot/init.gz
  • Enable it in /boot/config.txt: initramfs init.gz
  • Ensure the boot UART is disabled.

Expected behaviour

The initrd is loaded, and my overlayfs is set up properly.

Actual behaviour

The kernel fails to load the initrd. See dmesg output below.

System

  • Which model of Raspberry Pi? Raspberry Pi 2 Model B Rev 1.1
  • Which OS and version (cat /etc/rpi-issue)? Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 66255495f29be5d09b765d081aff6fc0f11e59b4, stage2
  • Which firmware version (vcgencmd version)?
Aug 26 2022 14:05:32 
Copyright (c) 2012 Broadcom
version 102f1e848393c2112206fadffaaf86db04e98326 (clean) (release) (start_x)
  • Which kernel version (uname -a)? Linux hyperion 5.15.68-v7+ #1587 SMP Tue Sep 20 11:15:23 BST 2022 armv7l GNU/Linux

Logs

[  +0.000757] Trying to unpack rootfs image as initramfs...
[  +0.000262] rootfs image is not initramfs (invalid magic at start of compressed archive); looks like an initrd
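The "invalid magic" message means the bytes the kernel saw at the start of the image were not a recognised archive header. One way to rule out a bad file on disk is to check the header of init.gz directly. The following is a minimal sketch, assuming the image is a single gzip-compressed cpio archive in "newc" format (mkinitramfs can be configured to use other compressors, or to prepend an uncompressed section, in which case this check would need adjusting). The function name describe_initramfs is hypothetical.

```python
import gzip
import io
import zlib

GZIP_MAGIC = b"\x1f\x8b"                   # first two bytes of any gzip stream
CPIO_NEWC_MAGICS = (b"070701", b"070702")  # ASCII magic of cpio "newc" headers


def describe_initramfs(data: bytes) -> str:
    """Classify raw image bytes by their leading magic numbers."""
    if data[:2] == GZIP_MAGIC:
        try:
            # Decompress only the first six bytes of the payload.
            with gzip.GzipFile(fileobj=io.BytesIO(data)) as stream:
                head = stream.read(6)
        except (OSError, EOFError, zlib.error):
            return "gzip header present, but the stream does not decompress"
        if head in CPIO_NEWC_MAGICS:
            return "gzip-compressed cpio (newc) archive"
        return "gzip stream, but payload is not a newc cpio archive"
    if data[:6] in CPIO_NEWC_MAGICS:
        return "uncompressed cpio (newc) archive"
    return "unrecognised: no gzip or cpio magic at offset 0"
```

Calling describe_initramfs on the contents of /boot/init.gz (read in binary mode) and getting "gzip-compressed cpio (newc) archive" would suggest the file itself is fine and that the data is being damaged while it is loaded into memory.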

Additional context

If I enable the boot UART as described here, however, everything works fine. The initrd is loaded, and my overlayfs works. I successfully repeated this procedure multiple times while writing this post.

flipreverse avatar Sep 22 '22 20:09 flipreverse

bootcode.bin only loads start.elf and fixup.dat. The OS and initrd are loaded by start.elf. What's the latest known good version of start.elf that works for you (vcgencmd version)?

UART logging adds some delays. If you revert the BOOT_UART change and just add uart_2ndstage=1 to config.txt does that have the same effect?

timg236 avatar Sep 23 '22 08:09 timg236

Adding uart_2ndstage=1 to /boot/config.txt seems to have the same effect. Which files do I have to replace when moving backwards? Just start_x.elf and fixup.dat?

flipreverse avatar Sep 23 '22 12:09 flipreverse

Which files do I have to replace when moving backwards? Just start_x.elf and fixup.dat?

Make sure they are matching (e.g. start_x.elf and fixup_x.dat) and they are the ones you are using (start_x.elf only used if start_x=1 is in config.txt). But should be easy to test if you are doing the right thing (it will boot and vcgencmd version will change output to match).
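To compare firmware builds while swapping files, it can help to pull the commit hash and variant out of the vcgencmd version output rather than eyeballing it. This is a minimal sketch, assuming the output always has the three-line shape pasted in this thread (build date, copyright line, version line); the names FirmwareVersion and parse_vcgencmd_version are hypothetical.

```python
import re
from typing import NamedTuple


class FirmwareVersion(NamedTuple):
    build_date: str
    commit: str
    tree_state: str  # "clean" or "tainted" in the outputs seen in this thread
    variant: str     # "start" or "start_x" in the outputs seen in this thread


# Matches e.g. "version <40 hex digits> (clean) (release) (start_x)"
VERSION_LINE = re.compile(
    r"^version\s+([0-9a-f]{40})\s+\((\w+)\)\s+\((\w+)\)\s+\((\w+)\)$"
)


def parse_vcgencmd_version(text: str) -> FirmwareVersion:
    """Extract build date, commit hash, tree state and variant."""
    lines = [line.strip() for line in text.strip().splitlines()]
    for line in lines:
        match = VERSION_LINE.match(line)
        if match:
            # The first line of the output is taken to be the build date.
            return FirmwareVersion(lines[0], match.group(1), match.group(2), match.group(4))
    raise ValueError("no 'version' line found")
```

If the variant field does not change after replacing files, the firmware is probably still loading the other pair (start.elf versus start_x.elf).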

popcornmix avatar Sep 23 '22 12:09 popcornmix

I think I narrowed it down to these commits: Last working commit in this repo is: 329af8a59d91ea023ce3e2566e2ccd4ec0624438

Jun 15 2022 20:08:33
Copyright (c) 2012 Broadcom
version ede5f7d365ef42ebbd4aa144ee5b51ea75604c82 (clean) (release) (start_x)

Next commit df569e043fe498afac2506ed45765d58aa80c408 seems to be broken:

Jul  4 2022 14:42:50
Copyright (c) 2012 Broadcom
version 01db5c05be1b3594c8f040ed7c1ac2748890c2c8 (clean) (release) (start_x)

flipreverse avatar Sep 23 '22 14:09 flipreverse

Thanks - that's really helpful. Unfortunately for us that short list of commits consists of 3 that are clocking-related, and 5 which look as though they should be harmless.

pelwell avatar Sep 23 '22 14:09 pelwell

Yikes. If you have any new firmware binaries, feel free to contact me. :)

flipreverse avatar Sep 23 '22 14:09 flipreverse

There is a trial firmware, with just the 3 clock-related patches reverted, available to download here: https://drive.google.com/file/d/1g-F3KC2Kp14MSrJR7gYPPCQ_eK43TyEv/view?usp=sharing It boots for me on a 2B, but I've not tested it beyond that.

pelwell avatar Sep 23 '22 15:09 pelwell

I previously used the _x files. Does this matter? If so, the Google Drive link offers the wrong files.

flipreverse avatar Sep 23 '22 18:09 flipreverse

Remove start_x=1 from config.txt and use the files start.elf/fixup.dat.
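The rule described in this thread (start_x.elf and fixup_x.dat are used only when start_x=1 is set, otherwise start.elf and fixup.dat) can be sketched as a small check against the text of config.txt. This is an illustration only: it assumes "#" starts a comment, ignores conditional section filters such as [all], and does not account for any other setting that might select different firmware files. The function name expected_firmware_files is hypothetical.

```python
from typing import Tuple


def expected_firmware_files(config_text: str) -> Tuple[str, str]:
    """Return the (start, fixup) pair implied by start_x in config.txt text."""
    start_x = False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments (assumed syntax)
        if not line or line.startswith("["):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition("=")
        if sep and key.strip() == "start_x":
            start_x = value.strip() == "1"  # the last assignment wins
    if start_x:
        return ("start_x.elf", "fixup_x.dat")
    return ("start.elf", "fixup.dat")
```

Checking the result against the files actually copied onto the boot partition is a quick way to avoid testing a mismatched pair.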

popcornmix avatar Sep 23 '22 18:09 popcornmix

Just to be sure. vcgencmd version reports:

Sep 23 2022 15:58:05 
Copyright (c) 2012 Broadcom
version d3dbb0500a7fdd0dec3183a56f25238cb45e19b5 (tainted) (release) (start)

However, that didn't do the trick. :( I still have to add uart_2ndstage=1 to my config.

flipreverse avatar Sep 23 '22 20:09 flipreverse

Let's try again, this time reverting all the firmware commits between those two hashes: https://drive.google.com/file/d/1e0jiW91R8SD7l-rZRaCqmEv4A9utZ64e/view?usp=sharing This time I've supplied the x variants, so you should be able to restore start_x=1.

pelwell avatar Sep 26 '22 15:09 pelwell

Sep 26 2022 15:34:42 
Copyright (c) 2012 Broadcom
version ad2c2750f3ec6fd0aa84bb6d931d9f793a84b7c8 (tainted) (release) (start_x)

/boot/config.txt:

[all]
initramfs init.gz
#uart_2ndstage=1
start_x=1

This one works.

flipreverse avatar Sep 26 '22 15:09 flipreverse

That's a relief - I don't like Heisenbugs. If I can't see the problem after reading the remaining commits, I may have to resort to a binary chop.
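For readers unfamiliar with the term, a binary chop over a list of commits can be sketched as below. This is a generic illustration, not the procedure actually used here: it assumes the commits are ordered oldest to newest, that the first is known good and the last known bad, and that is_bad (a hypothetical callback that would build and boot-test a firmware) gives a reliable answer. With a timing-dependent bug like this one, each test might need several boots before a commit can be called good.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit, given commits[0] is good and commits[-1] is bad."""
    if len(commits) < 2:
        raise ValueError("need at least one known-good and one known-bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the breakage is at mid or earlier
        else:
            lo = mid  # the breakage is after mid
    return commits[hi]
```

Each round halves the remaining range, so a short list of candidate commits needs only a handful of test builds.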

pelwell avatar Sep 26 '22 15:09 pelwell

Next one, this time with just one specific commit reverted: https://drive.google.com/file/d/1KX5VtdS5ItX5_HPHDyObhR-4k8iNefvY/view?usp=sharing

pelwell avatar Sep 26 '22 16:09 pelwell

Could you also tell me if you have any cameras attached, and if so, which ones?

pelwell avatar Sep 26 '22 16:09 pelwell

Could you also tell me if you have any cameras attached, and if so, which ones?

Yes, I do have one attached. I think it's this one: 'Raspberry Pi Camera Module 2 NoIR'.

It works with:

Sep 26 2022 17:09:28 
Copyright (c) 2012 Broadcom
version 4bcce7924dc536394dec4b54d5fc2fae196d2394 (tainted) (release) (start_x)

flipreverse avatar Sep 26 '22 17:09 flipreverse

Ooh, interesting. Sorry, @naushir - that's a top-of-tree firmware + a reversion of 89af8e130aa635d8c31c0c3394461a7605858260.

pelwell avatar Sep 26 '22 19:09 pelwell

Oops! I have no idea how that change could cause this issue. I'll look into it....

naushir avatar Sep 27 '22 07:09 naushir

@flipreverse can you tell me the size of your init.gz file? Additionally, if you unplug your camera, do you still get the failure on boot?

naushir avatar Sep 28 '22 12:09 naushir

@flipreverse can you tell me the size of your init.gz file? Additionally, if you unplug your camera, do you still get the failure on boot?

The init.gz is about 8.3MB.

Yes, it works without activating the secondary stage uart if I unplug my camera. Firmware version:

Aug 26 2022 14:05:32 
Copyright (c) 2012 Broadcom
version 102f1e848393c2112206fadffaaf86db04e98326 (clean) (release) (start_x)

flipreverse avatar Sep 28 '22 16:09 flipreverse

@flipreverse there is an rpi-update firmware with a possible fix for this.

popcornmix avatar Sep 30 '22 13:09 popcornmix

@flipreverse there is an rpi-update firmware with a possible fix for this.

Sep 30 2022 14:28:00 
Copyright (c) 2012 Broadcom
version afe2b3f5f60315137568e903508e7eae8b6543a4 (clean) (release) (start_x)

This version looks good.

Out of curiosity: What was the cause of this bug?

flipreverse avatar Oct 03 '22 17:10 flipreverse

@naushir provided the fix, so I'm not certain. The issue seemed to start with putting camera detection into a separate gpu thread. The fix was a partial revert of that.

popcornmix avatar Oct 03 '22 17:10 popcornmix

Yes, it reverted the change that moved the auto-detection routine into a separate thread.

naushir avatar Oct 04 '22 13:10 naushir

Fix was released in https://github.com/raspberrypi/firmware/commit/2b3cef2f4e9987ab4ad5e07578c2f6192aa7787d

timg236 avatar Nov 28 '22 15:11 timg236