firmware icon indicating copy to clipboard operation
firmware copied to clipboard

Regression booting rpi3/4 with U-Boot on recent firmware

Open nullr0ute opened this issue 4 years ago • 46 comments

Describe the bug Moving Fedora from firmware 7208c3d to 25e2b59 without any other changes has regressed booting with U-Boot on Fedora.

It stops the RPi3 from booting at all and the RPi4 boots but extremely slowly. The F-34 U-Boot for RPi-3 and RPi-4

To reproduce

U-Boot 2021.04 (Apr 21 2021 - 00:00:00 +0000)

DRAM:  992 MiB
RPI 3 Model B+ (0xa020d3)
MMC:   sdhci_setup_cfg: Hardware doesn't specify base clock frequency
sdhci_setup_cfg: Hardware doesn't specify base clock frequency
mmcnr@7e300000 - probe failed: -22
mmc@7e202000: 0sdhci_setup_cfg: Hardware doesn't specify base clock frequency

Loading Environment from FAT... fsm 1, hsts 00000080
read timeout error - HSTS 00000080
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:382/bcm2835_prepare_data()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
unable to select a mode
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: USB DWC2
scanning bus usb@7e980000 for devices... 4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0 
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
fsm 1, hsts 00000080
read timeout error - HSTS 00000080
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:382/bcm2835_prepare_data()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
WARNING at drivers/mmc/bcm2835_sdhost.c:414/bcm2835_send_command()!
unable to select a mode
no mmc device at slot 1

Device 0: unknown device
lan78xx_eth Waiting for PHY auto negotiation to complete...... done
BOOTP broadcast 1
DHCP client bound to address 192.168.0.39 (4 ms)
Using lan78xx_eth device
TFTP from server 192.168.0.20; our IP address is 192.168.0.39

Expected behaviour No "sdhci_setup_cfg: Hardware doesn't specify base clock frequency" errors and timeouts reading from the MMC

Actual behaviour Timeouts on the mSD slot

System Vanilla F-34 with just the RPi firmware updated.

  • Which model of Raspberry Pi? RPi3b/RPi3B+/RPi4
  • Which OS and version (cat /etc/rpi-issue)? Fedora release 34 (Thirty Four)
  • Which firmware version (vcgencmd version)? 2021-08-19T12:27:51 - 25e2b59
  • Which kernel version (uname -a)? 5.13.12-200.fc34.aarch64

Logs Included above

Additional context

Looks most like the following but I will be checking all the in between builds:

firmware: arm_loader: Make most arm clock requests required
See: https://github.com/raspberrypi/firmware/issues/1598

nullr0ute avatar Sep 01 '21 21:09 nullr0ute

Confirming this issue... SD card throughput plummets with recent firmware.

From u-boot 2021.07 with rpi_arm64_defconfig:

fatload mmc 0:1 ${kernel_addr_r} <image>

Currently:

5259877 bytes read in 44908 ms (114.3 KiB/s)
   Uncompressing Kernel Image

Previously:

5259877 bytes read in 226 ms (22.2 MiB/s)
   Uncompressing Kernel Image

the current 1.20210831 tag is affected, 1.20210805 is not.

25e2b597ebfb2495eab4816a276758dcc6ea21f1 seems to introduce the issue. 69cfd9f03a2b4aab6aec021328be9355d1dad1db is not affected.

Works:

root@rpi-87fa00:~# vcdbg version
vcos_build_version     = 046501ae7f6bb2bb270f609d764be02a5fb2cf65 (clean)
vcos_build_branch      = bcm2711_2
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_application'
vcos_build_application = ?
vcos_build_date        = Aug 18 2021
vcos_build_time        = 12:56:16
vcos_build_user        = dom
vcos_build_hostname    = buildbot
vcos_build_platform    = raspberrypi_linux
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_type'
vcos_build_type        = ?

affected:

root@rpi-87fa00:~# vcdbg
vcdbg - no command specified
root@rpi-87fa00:~# vcdbg version
vcos_build_version     = ef2c018dccdeb94b0376db62a2ea4c882f9b500d (clean)
vcos_build_branch      = bcm2711_2
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_application'
vcos_build_application = ?
vcos_build_date        = Aug 19 2021
vcos_build_time        = 12:28:18
vcos_build_user        = dom
vcos_build_hostname    = buildbot
vcos_build_platform    = raspberrypi_linux
debug_sym: ReadVideoCoreMemoryBySymbol: Symbol not found: 'vcos_build_type'
vcos_build_type        = ?

vfazio avatar Sep 02 '21 20:09 vfazio

This is hitting the latest edk2 builds as well. I debugged it a bit in uboot last week and the mailbox is returning a 0 clock rate. I need to see if the mailbox structure changed/whatever but I'm away from the pi's today.

jlinton avatar Sep 06 '21 19:09 jlinton

Any further information you can provide about the clock in question and the circumstances surrounding the 0 rate would be appreciated.

pelwell avatar Sep 07 '21 09:09 pelwell

@pelwell

I put some print statements in u-boot:

MMC:   bcm2835: clk: 4 clkstate: 1
MMC:   bcm2835: clk: 4 clkrate: 0
mmc@7e202000: 0
Loading Environment from FAT... slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
OK
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: usb dr_mode not found
USB DWC2
scanning bus usb@7e980000 for devices...
Warning: smsc95xx_eth (eth0) using random MAC address - 52:40:40:f3:94:cb
4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
U-Boot> mmcinfo
Device: mmc@7e202000
Manufacturer ID: 3
OEM: 5344
Name: SC32G
Bus Speed: 0
Mode: SD High Speed (50MHz)
Rd Block Len: 512
SD version 3.0
High Capacity: Yes
Capacity: 29.7 GiB
Bus Width: 4-bit
Erase Group Size: 512 Bytes

on previous firmware:

MMC:   bcm2835: clk: 4 clkstate: 1
MMC:   bcm2835: clk: 4 clkrate: 250000000
mmc@7e202000: 0
Loading Environment from FAT... slow clock: set cdiv: 2047
slow clock: set cdiv: 2047
sdhost: mmclock: 122129
sdhost: set cdiv: 2045
sdhost: mmclock: 50000000
sdhost: set cdiv: 3
OK
In:    serial
Out:   vidconsole
Err:   vidconsole
Net:   No ethernet found.
starting USB...
Bus usb@7e980000: usb dr_mode not found
USB DWC2
scanning bus usb@7e980000 for devices...
Warning: smsc95xx_eth (eth0) using random MAC address - 16:d1:56:63:74:ef
4 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
Hit any key to stop autoboot:  0
U-Boot> mmc info
Device: mmc@7e202000
Manufacturer ID: 3
OEM: 5344
Name: SC32G
Bus Speed: 50000000
Mode: SD High Speed (50MHz)
Rd Block Len: 512
SD version 3.0
High Capacity: Yes
Capacity: 29.7 GiB
Bus Width: 4-bit
Erase Group Size: 512 Bytes

So on newer firmware bcm2835_get_mmc_clock https://source.denx.de/u-boot/u-boot/-/blob/v2021.07/arch/arm/mach-bcm283x/msg.c#L74

is returning 0 for clock BCM2835_MBOX_CLOCK_ID_CORE (4)

vfazio avatar Sep 07 '21 18:09 vfazio

Ok, reproduced it fully with a rpi4-8G model in edk2.

When it works (with a newer sd card)

... RpiFirmwareSetClockState: Request clock state C = 1
RpiFirmwareSetClockState: Request clock state 1 = 1 ... RpiFirmwareGetClockRate: Get Clock Rate: Tag=30002 ClockId=C
RpiFirmwareGetClockRate: Get Clock Rate (MailboxBuf=F5A5F000):
20 00 00 00 00 00 00 00
02 00 03 00 08 00 00 00
00 00 00 00 0C 00 00 00
00 00 00 00 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate Done:
20 00 00 00 00 00 00 80
02 00 03 00 08 00 00 00
08 00 00 80 0C 00 00 00
00 E1 F5 05 00 00 00 00
RpiFirmwareGetClockRate: Get Clock Rate return: ClockRate=100000000

But that last bit when it fails:

RpiFirmwareGetClockRate: Get Clock Rate: Tag=30002 ClockId=C RpiFirmwareGetClockRate: Get Clock Rate (MailboxBuf=F5A5F000): 20 00 00 00 00 00 00 00 02 00 03 00 08 00 00 00 00 00 00 00 0C 00 00 00 00 00 00 00 00 00 00 00 RpiFirmwareGetClockRate: Get Clock Rate Done: 20 00 00 00 00 00 00 80 02 00 03 00 08 00 00 00 08 00 00 80 0C 00 00 00 00 00 00 00 00 00 00 00 RpiFirmwareGetClockRate: Get Clock Rate return: ClockRate=0

The fedora defect: https://bugzilla.redhat.com/show_bug.cgi?id=1999180

Notes that uboot is splatting:

sdhci_setup_cfg: Hardware doesn't specify base clock frequency

which happens the same way (the mailbox is returning that the clock rate is 0)

but it also notes that its dependent on SD card, which seems to be the case since I spent an hour or so thinking I was crazy because I couldn't get it to trigger today.

Further, its not apparently some kind of powersave/whatever going on because If I issue a request to set the clockrate to 100mhz on the emmc2 right before requesting the current state it will accept the request, but then immediately return that the clockrate is 0.

jlinton avatar Sep 07 '21 22:09 jlinton

Please note, that I am seeing the same behaviour here with a RaspberryPi2 and the latest 1.20210831 firmware. As soon as I use that version rather than the previous 1.20210805 one the SD card performance within u-boot drops significantly (from ~ 23MB/s to 120KB/s). So it seems that this is not limited to a Pi3 or Pi4 but also affects the older Pi2 as well.

jens-maus avatar Sep 08 '21 19:09 jens-maus

Is there any further information needed to help with this issue?

vfazio avatar Sep 10 '21 14:09 vfazio

@popcornmix This seems to be a direct consequence of making GET_CLOCK_RATE specify the REQUIRED flag for all but CLOCK_ARM.

pelwell avatar Sep 10 '21 15:09 pelwell

GET_CLOCK_RATE has always been problematic, as the firmware can change the clocks for a variety of reasons. e.g. core clock changes after initial_turbo has expired, during hdmi mode sets, when temperature is high, and during expensive gpu operations. This caused an issue with the linux kernel's use, as sometimes it queried the clock and skipped setting it. That caused hard to reproduce errors when the clock just happened to temporarily match what was wanted, so the request was skipped, and when the firmware removed it's clock requirement the clock no longer matched what was wanted by linux.

The fix for this was to make GET_CLOCK_RATE just return the last value requested by SET_CLOCK_RATE. The actual clock chosen is the max of all requests (i.e. what the arm has requested, plus requests from the firmware).

If uboot is just using this call to get a fixed clock without setting it, it will now get zero returned. Note: that this was always unsafe as on different boots you could end up with a different value returned.

It may be better for uboot to call GET_MIN_CLOCK. That should return the value you were previously usually getting before the firmware update. However it will now not be subject to changes based on temperature, initial_turbo, hdmi requirements etc.

The other option is to use GET_CLOCK_MEASURED, which will return the actual value of a clock, but this will include temporary changes based on what firmware is doing, and as it's actually measured, the values have a little jitter on them.

popcornmix avatar Sep 10 '21 16:09 popcornmix

@popcornmix thanks

I thought @jlinton mentioned that setting the clock and then retrieving it still returned 0... I can try playing around with uboot to see what happens with:

  1. setting the rate then retrieving it
  2. using GET_MIN_CLOCK
  3. using GET_CLOCK_MEASURED

vfazio avatar Sep 10 '21 16:09 vfazio

I'd expect setting clock and retrieving it to give the value you set. Let me now if you can confirm that.

popcornmix avatar Sep 10 '21 17:09 popcornmix

MIN clock, or MAX clock? I think in both uboot/edk2 the returned clock rate is being used to select the smallest possible divisor for the given card (aka run it as fast as possible). I don't think either of those project is really interested in a dynamic clock value, but rather its peak. If the actual clock rate is lower, AFAIK SD cards should be able to deal with that. The problem is if it ends up going over the card max.

jlinton avatar Sep 10 '21 18:09 jlinton

MAX clock may be safer if you don't want a divided down value to exceed a specced value.

popcornmix avatar Sep 10 '21 18:09 popcornmix

I can confirm that, within u-boot at least, if i issue SET_CLOCK_RATE, i receive the requested rate in the response to the request and in a subsequent GET_CLOCK_RATE message

U-Boot 2021.07 (Sep 10 2021 - 19:03:20 +0000)

DRAM:  948 MiB
RPI Compute Module 3+ (0xa02100)
MMC:   
bcm2835: set_clock_rate response 4 250000000
bcm2835: get_clock_rate response 4 250000000
bcm2835: get_max_clock_rate response 4 400000000

Whereas without the SET_CLOCK_RATE call:

MMC:   
bcm2835: get_clock_rate response 4 0
bcm2835: get_max_clock_rate response 4 400000000

uboot will boot fine with either GET_MIN_CLOCK_RATE or GET_MAX_CLOCK_RATE, though it reads from the sd card faster with GET_MIN_CLOCK_RATE

vfazio avatar Sep 10 '21 20:09 vfazio

What happens if MAX is read, and then set? Does it actually adjust the clock rate near the max? I would imagine that would be the safe method of maintaining the higher speeds? (I will probably take another pass with edk2 later this evening)

jlinton avatar Sep 10 '21 20:09 jlinton

@jlinton i'll test that in a bit. I was trying to divine what's being done in the Linux driver, the logic is a little different than what's being done in uboot

https://github.com/raspberrypi/linux/blob/1.20210831/drivers/mmc/host/bcm2835-sdhost.c#L2127 https://github.com/raspberrypi/linux/blob/1.20210831/drivers/mmc/host/bcm2835-sdhost.c#L1558

vfazio avatar Sep 10 '21 20:09 vfazio

Setting the core clock (4) to the max maintains the higher speeds

MMC:   
bcm2835: get_clock_rate response 4 0
bcm2835: get_max_clock_rate response 4 400000000
bcm2835: set_clock_rate response 4 400000000
mmc@7e202000: 0
...
U-Boot> boot
5259877 bytes read in 226 ms (22.2 MiB/s)
   Uncompressing Kernel Image

though i'm a little wary of changing the core clock to max (core is currently being queried due to this note https://github.com/raspberrypi/firmware/issues/953#issuecomment-373790925)

vfazio avatar Sep 10 '21 21:09 vfazio

So, i'm looking at the code now, and I now remember the dance edk2+ACPI is playing with Linux. (and I think this is true of DT too for the EMMC2). Linux, in the case of the EMMC2, is depending on the base clock rate to actually reflect what is in the SDHCI base clock frequency field (which is 100Mhz). In the case of the older EMMC which doesn't provide a meaningful cap's register, we are using the generic linux sdhci-caps/mask property to override the field to 250mhz.

So, we can play this dance and change the freq but we need to override the sdhci base frequency fields being fed into Linux. Leading me to think, I'm just going to validate that the requested frequency is between min->max and request 100 (or 250) mhz.

But, I'm not sure if I'm doing something wrong (I don't think so, we can override the core clock/etc) but I've tried changing the emmc2 clock (ClockId=C) with both the skip turbo and !skip turbo set, and I read it immediately back and its still zero, but only with the card that's failing. Otherwise, everything is fine (newer cards). The card that consistently fails for me is an old patriot 8GB SDHC class 4 card (SD2.0). This is following a 1.5Ghz request of the core clock, which is responding that the clock rate is what we requested.

Further, if I ignore the clock=0 return, assume the clock is what we requested, everything seem to work fine, including Linux.

jlinton avatar Sep 10 '21 23:09 jlinton

@popcornmix This seems to be a direct consequence of making GET_CLOCK_RATE specify the REQUIRED flag for all but CLOCK_ARM.

I assumed you were talking about something internal to the vc firmware since I don't see that here: https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface#get-clock-rate or is there a change not documented there?

jlinton avatar Sep 10 '21 23:09 jlinton

@vfazio With a supporting firmware (which they all will have been since the Pi 3 was launched) the sdhost driver delegates the SD clock management to the firmware, relying on it to adjust the divisor as the core clock changes.

@jlinton Yes, I was referring to a change in the firmware code that implements GET_CLOCK_RATE.

pelwell avatar Sep 11 '21 08:09 pelwell

Just wanted to add that we are seeing the same results reported above with our bare metal kernels.

Any firmware version from commit 25e2b59 always returns 0 for GET_CLOCK_RATE requests for the following clocks:

CLOCK_ID_EMMC (1) CLOCK_ID_UART (2) CLOCK_ID_EMMC2 (0xC)

regardless of whether SET_CLOCK_RATE is called prior or not.

These all return correct values for GET_MIN_CLOCK and GET_MAX_CLOCK and all behave correctly with any earlier firmware version.

Other clocks such as CLOCK_ID_CORE (4) return 0 prior to calling SET_CLOCK_RATE but return a correct value for GET_CLOCK_RATE afterwards.

Commit 57e7fd1 in March 2021 changed the response for CLOCK_ID_CORE (4) and others, at that point we added code to call SET_CLOCK_RATE for some of these so we could calculate correct divisors for I2C, SPI and SDHOST.

Clearly the sequence of calling SET_CLOCK_RATE and passing the value from GET_MAX_CLOCK will also be acceptable for EMMC, UART and EMMC2 but at present it simply doesn't work.

ultibohub avatar Sep 12 '21 07:09 ultibohub

I can confirm that, within u-boot at least, if i issue SET_CLOCK_RATE, i receive the requested rate in the response to the request and in a subsequent GET_CLOCK_RATE message

Did you have a patch published somewhere to use as a testing base?

nullr0ute avatar Sep 13 '21 13:09 nullr0ute

@nullr0ute I'll push something soonish. There may be two patches, one to implement the logic that's being used in Linux re letting the firmware manage the clock, and another likely to have some safeguard behavior if GET_CLOCK_RATE is 0 to make a call to GET_MIN_CLOCK and SET_CLOCK_RATE and to use the clock value in the response

I need to test that second patch w/o the first patch to make sure it works as intended.

vfazio avatar Sep 13 '21 14:09 vfazio

@nullr0ute This is based on the 2021.07 release: https://github.com/vfazio/u-boot/commits/vfazio-rpi-mmc-clock-fix but they'll probably apply cleanly to 2021.04.

I'll try to submit the series to upstream tomorrow once i'm comfortable with the commit message wording.

vfazio avatar Sep 13 '21 22:09 vfazio

I did push these to U-Boot's patchwork https://patchwork.ozlabs.org/project/uboot/list/?series=262375 we'll see if they get included in the october release..

vfazio avatar Sep 15 '21 14:09 vfazio

I'm just going to echo the last sentence of ultibohub's comment in case it was missed.

"Clearly the sequence of calling SET_CLOCK_RATE and passing the value from GET_MAX_CLOCK will also be acceptable for EMMC, UART and EMMC2 but at present it simply doesn't work."

I would also like clarification that SET_CLOCK_RATE will act as a clamp and the firmware won't exceed the requested value, as well as it will try and honor that value unless there is a thermal/power problem.

Thanks,

jlinton avatar Sep 22 '21 18:09 jlinton

I would also like clarification that SET_CLOCK_RATE will act as a clamp and the firmware won't exceed the requested value, as well as it will try and honor that value unless there is a thermal/power problem.

The clock will be set to the maximum of all requests. For example the firmware may have requested a higher value than the arm for core_clock (e.g. while changing hdmi mode), as not doing so will cause a failure.

Also there may be other constraints, such as clocks which share a PLL.

You'd probably have to give an example of the clock and value set, and whether the outcome was as expected.

popcornmix avatar Sep 22 '21 18:09 popcornmix

The clock will be set to the maximum of all requests. For example the firmware may have requested a higher value than the arm for core_clock (e.g. while changing hdmi mode), as not doing so will cause a failure.

@popcornmix what is the expected behavior of GET_CLOCK_RATE at this point?

You'd probably have to give an example of the clock and value set, and whether the outcome was as expected.

I've just retested with commit f9bc224 and for CLOCK_ID_EMMC (1), CLOCK_ID_UART (2) and CLOCK_ID_EMMC2 (0xC) the return from GET_CLOCK_RATE is always 0 regardless of what value was set with SET_CLOCK_RATE.

Other clocks such as CORE and CPU still return the value that was set.

We've added an additional check so that if GET_CLOCK_RATE returns 0 we call GET_CLOCK_MEASURED instead but should we assume that GET_CLOCK_RATE will no longer function in a meaningful way and abandon using it completely?

We are aware of other projects seeing the same results as well.

ultibohub avatar Sep 23 '21 00:09 ultibohub

@popcornmix what is the expected behavior of GET_CLOCK_RATE at this point?

It returns the value requested by the arm.

I've just retested with commit f9bc224 and for CLOCK_ID_EMMC (1), CLOCK_ID_UART (2) and CLOCK_ID_EMMC2 (0xC) the return from GET_CLOCK_RATE is always 0 regardless of what value was set with SET_CLOCK_RATE.

let me check what that is doing.

popcornmix avatar Sep 23 '21 12:09 popcornmix

let me check what that is doing.

I can reproduce. For those clocks it's not behaving as designed. I'll have a think about best way to fix it.

popcornmix avatar Sep 23 '21 16:09 popcornmix