Boot fails when the CONFIG_EFI_DISABLE_PCI_DMA=y security option is enabled in the Linux kernel.

Open wizeman opened this issue 3 years ago • 4 comments

When booting a RPi 4 (8G RAM model) with RPi4 UEFI firmware v1.28 (either with the 3G RAM limit enabled or disabled) + EFI GRUB 2.06-rc1 + Linux kernel version 5.10.54 with CONFIG_EFI_DISABLE_PCI_DMA enabled, GRUB seems to fail loading the kernel with the following errors:

EFI stub: Booting Linux kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
EFI stub: ERROR: Exit boot services failed

The above messages appear and disappear very quickly, so they are easy to miss unless you're paying close attention, and they're basically impossible to read as-is. I had to record a video and step through it frame by frame to be able to read them properly.

And then GRUB says:

Failed to boot both default and fallback entries.

And then GRUB goes back to the main menu screen.

After searching online, it was suggested that the EFI stub error message is related to the CONFIG_EFI_DISABLE_PCI_DMA=y kernel config option.

Specifically, it was suggested that some buggy UEFI firmwares didn't work when that config option was enabled.

To check if indeed that was the problem, it was suggested to boot with the kernel command line parameter efi=no_disable_early_pci_dma.

And indeed, when I boot with the above kernel parameter, GRUB loads the kernel without any issues.
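For anyone wanting to confirm which way a given system was booted, here is a minimal Python sketch that looks for the option in a kernel command line string. The helper names `has_efi_option` and `running_cmdline` are made up for illustration, the whitespace split does not handle quoted arguments, and it assumes `efi=` takes a comma-separated list of options.

```python
def has_efi_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears in any efi=... argument of `cmdline`."""
    for token in cmdline.split():  # naive split: quoted arguments are not handled
        if token.startswith("efi="):
            # Assumption: efi= carries a comma-separated list of options.
            if option in token[len("efi="):].split(","):
                return True
    return False


def running_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()
```

On a system booted with the workaround, `has_efi_option(running_cmdline(), "no_disable_early_pci_dma")` would be expected to return `True`.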

It would be good to fix the above issue because it's not obvious that the boot failure is related to that config option (especially considering that the relevant error message passes by so quickly as to be almost imperceptible).

wizeman • Jul 31 '21 15:07

That sorta makes sense. Presumably one of the UEFI drivers (probably XHCI) is failing to shut down cleanly if DMA has been disabled under it, likely because its command ring exists in main memory and requires DMA to operate. That error then gets propagated to exit boot services. Then of course the EFI stub drops back to GRUB, which can't read anything because it doesn't know that the interface has been disabled.
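To make that guess concrete, here is a toy Python model of the suspected sequence. It is not firmware or kernel code: `ToyController` and `toy_exit_boot_services` are hypothetical names, and the only real detail borrowed is that bit 2 of the PCI Command register is Bus Master Enable. The assumption being modelled is that the controller can only stop cleanly while it is still allowed to DMA.

```python
PCI_COMMAND_MEMORY = 0x2  # Memory Space Enable bit of the PCI Command register
PCI_COMMAND_MASTER = 0x4  # Bus Master Enable bit of the PCI Command register


class ToyController:
    """Hypothetical stand-in for a DMA-dependent controller such as XHCI."""

    def __init__(self) -> None:
        # Start with memory decoding and bus mastering both enabled.
        self.command = PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER

    def disable_bus_mastering(self) -> None:
        # Clear only the Bus Master Enable bit, leaving the other bits alone.
        self.command &= ~PCI_COMMAND_MASTER

    def stop(self) -> bool:
        # Assumption: a clean stop needs the device to fetch a command from
        # a ring in main memory, which only works while DMA is allowed.
        return bool(self.command & PCI_COMMAND_MASTER)


def toy_exit_boot_services(controllers) -> bool:
    # Ask every controller to stop, then fail if any of them could not.
    results = [c.stop() for c in controllers]
    return all(results)
```

In this model, calling `disable_bus_mastering()` on a controller before `toy_exit_boot_services()` makes the latter return `False`, which mirrors the "Exit boot services failed" message only under the stated assumption.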

I'm sorta on the fence about whether this is a "buggy UEFI firmware". I would have to go back and spend some time looking at the spec to decide one way or the other, as well as whether the EFI stub in the kernel is performing the handover correctly. The kernel/stub feature was added to close a security loophole where a device could potentially DMA to unprotected main memory following exit boot services. This is sorta nonsense on the RPi 4 because it doesn't have a (documented to exist?) IOMMU. And my 30-second reading of the situation is that the HW/firmware is buggy if it's leaving a device active that could DMA following exit boot services, particularly since something like a GPU appears to be left active anyway.

So, I've sorta got to ask, which Linux distro/tree had this turned on?

jlinton • Aug 27 '21 04:08

> So, I've sorta got to ask, which Linux distro/tree had this turned on?

Mine didn't, but I've been running all my machines (including laptops, desktops and servers) with this config option manually turned on for more than a year without any issues (except for this one, of course).
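A quick way to confirm whether a particular kernel build has the option turned on is to look for it in the kernel's config file. The sketch below is a minimal Python version of that check. The function names are hypothetical, and where the config file lives is an assumption that varies by distro: commonly `/proc/config.gz` (only when the kernel exposes it) or a `config-*` file under `/boot`.

```python
import gzip


def option_enabled(config_text: str,
                   option: str = "CONFIG_EFI_DISABLE_PCI_DMA") -> bool:
    """Return True if the kernel config text contains `option=y`."""
    # Disabled options show up as "# CONFIG_FOO is not set", so an exact
    # match on "CONFIG_FOO=y" is enough to tell the two apart.
    return any(line.strip() == f"{option}=y"
               for line in config_text.splitlines())


def read_kernel_config(path: str) -> str:
    """Read a kernel config file, transparently handling .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return f.read()
```

For example, `option_enabled(read_kernel_config("/proc/config.gz"))` would report the setting for the running kernel, assuming that file exists on the system.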

wizeman • Aug 29 '21 00:08

Was this related to the other RPi kernel XHCI/USB issues from that time period? Is this still an issue with 1.32?

paulwratt • Feb 18 '22 00:02

> Is this still an issue with 1.32?

I believe so. Recently (after having upgraded to 1.32), I had a bootable USB drive on which I had forgotten to add the efi=no_disable_early_pci_dma boot parameter, and it failed to boot as described in this issue. It worked once I added the parameter.

wizeman • Feb 18 '22 13:02