A2101 - Crash running "ls" in grub shell after removing a USB stick

Open xry111 opened this issue 3 years ago • 7 comments

When I booted the system with a USB stick plugged in, the boot failed because (hd0) changed from the HDD to the USB stick. That in itself is not an issue: such behavior is common across motherboards. But when I removed the USB stick and tried to verify with "ls" in the grub shell, it crashed:

UsbRemoveDevice: device 1 removed
UsbEnumeratePort: device disconnected event on port 1

grub> ls
(proc) (hd0) 
SystemContext.SystemContextLoongArch address 0xFDE0BEB4
CsrCrmd   0xB0
CsrPrmd   0x4
CsrEctl  0x800
CsrEstat   0x480000
CsrEpc    0xF9E27B4C
CsrBadv    0xAFAFAFAFAFAFAFBF
CsrBadi 0x2400132B
Shut down slave cores done!
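
For reference, the CsrEstat value in the dump can be split into its fields. This is a minimal sketch, assuming the ESTAT layout given in the LoongArch reference manual (interrupt status in bits 12:0, Ecode in bits 21:16, EsubCode in bits 30:22). The function name `decode_estat` is hypothetical and not part of any firmware or GRUB code.

```python
def decode_estat(estat: int) -> dict:
    """Split a LoongArch ESTAT value into its fields (assumed layout)."""
    return {
        "is": estat & 0x1FFF,               # bits 12:0, pending interrupts
        "ecode": (estat >> 16) & 0x3F,      # bits 21:16, exception code
        "esubcode": (estat >> 22) & 0x1FF,  # bits 30:22, exception subcode
    }


fields = decode_estat(0x480000)  # CsrEstat from the dump
print(fields)
```

Under that layout the dump gives Ecode 0x8 and EsubCode 0x1. If the manual's exception table is read correctly, that pair denotes an address error on a memory access, which would fit the invalid-looking address in CsrBadv.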

xry111 avatar Jul 02 '22 04:07 xry111

Yes, it is a bug in GRUB: the device was removed, but its descriptor was not updated in the GRUB code.

kilaterlee avatar Jul 12 '22 07:07 kilaterlee

cc @yetist for grub port insights

xen0n avatar Jul 12 '22 07:07 xen0n

Indeed, we found evidence in the GRUB code that it does not update or remove the descriptor in a timely manner.

kilaterlee avatar Jul 12 '22 08:07 kilaterlee

Is this a grub bug? Is it related to LoongArch?

yetist avatar Jul 13 '22 02:07 yetist

With GRUB 2.06 for x86_64 EFI, (hd0) is also not removed from the ls output, but it does not crash.

To me, any application (not only GRUB) should print an error message when it hits an unexpected error (even an unrecoverable one), instead of just crashing.

xry111 avatar Jul 13 '22 02:07 xry111

Yes, it is not related to LoongArch. Please pay attention to this address: 0xAFAFAFAFAFAFAFAF. EDK2 writes this pattern when a resource is released, and GRUB uses EDK2's release method. The GRUB code does not handle operations such as quickly unplugging a USB device and then accessing it. My guess is that it does not crash on x86 platforms because they may be able to handle addresses like 0xAFAFAFxxxxx, but on LoongArch the address is not accessible, so it triggers a TLB exception. @zwaizwai, please add more details. Thanks!
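
The CsrBadv value in the dump, 0xAFAFAFAFAFAFAFBF, sits 0x10 bytes above that pattern. That would fit a field read through a pointer taken from freed memory. A minimal sketch of the check, assuming the fill byte is 0xAF as described above; the helper name `poison_offset` and the `max_offset` bound are hypothetical:

```python
def poison_offset(addr: int, fill_byte: int = 0xAF, max_offset: int = 0x1000):
    """Return addr's offset from the 64-bit fill pattern, or None if it is not close."""
    # Build the 8-byte pattern, e.g. 0xAFAFAFAFAFAFAFAF for fill_byte 0xAF.
    pattern = int.from_bytes(bytes([fill_byte]) * 8, "little")
    offset = addr - pattern
    return offset if 0 <= offset < max_offset else None


badv = 0xAFAFAFAFAFAFAFBF  # CsrBadv from the crash dump
print(hex(poison_offset(badv)))  # 0x10
```

A result other than None only suggests that the faulting address came from released memory; it does not identify which GRUB structure was stale.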

kilaterlee avatar Jul 13 '22 03:07 kilaterlee

You can view more detailed debug info after typing these commands:

grub> set debug=all
grub> ls

zwaizwai avatar Jul 13 '22 04:07 zwaizwai