Wifi fails under heavy load and requires reboot (mmc1: Timeout waiting for hardware interrupt)

Open LordRaptor opened this issue 4 years ago • 10 comments

Describe the bug

I'm running OctoPrint on my Raspberry Pi with a webcam, connecting in headless mode over wifi. If I watch the webcam stream from OctoPrint, which puts the wifi under heavy load, the wifi cuts out after a while (random: it can be minutes, it can be hours). Nothing but a reboot brings it back. Currently I'm using the OctoPi image, but I had the same issue using the official Raspbian image.

To reproduce

  1. Connect to a wifi
  2. Put the wifi under heavy load for an extended period of time
  3. Sometimes the wifi stops working, does not happen every time

Expected behaviour

Wifi keeps working

Actual behaviour

Wifi stops working and the device is unresponsive. Even with a screen and keyboard connected, I was unable to bring the interface down (ifdown didn't find the device anymore). ifconfig still listed it, but as not connected, and trying to scan for wifi networks led to a timeout.

System

https://pastebin.com/TxW2rkR8

Logs

https://pastebin.com/GNxF3A9h

Additional info

I have a second MicroSD card I can use to run tests on, if that would help

LordRaptor avatar Aug 25 '21 06:08 LordRaptor

I have the same problem on a Raspberry Pi 400: my wifi suddenly stops working and I need to restart it.

ifdown didn't find the device anymore

This command didn't work for me either but using ifconfig works fine for me:

sudo ifconfig wlan0 down and then, after some time, sudo ifconfig wlan0 up
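For anyone who wants to script this workaround, here is a minimal sketch of the same down/wait/up sequence. The function name `bounce_interface` and the 10 second pause are assumptions for illustration; the thread only says "after some time". The `run` and `sleep` parameters exist so the sequence can be checked without touching a real interface.

```python
import subprocess
import time


def bounce_interface(iface="wlan0", pause_s=10.0, run=subprocess.run, sleep=time.sleep):
    """Take the interface down, wait, then bring it back up."""
    # Same as: sudo ifconfig wlan0 down
    run(["ifconfig", iface, "down"], check=True)
    # "after some time": the pause length is a guess, not taken from the thread
    sleep(pause_s)
    # Same as: sudo ifconfig wlan0 up
    run(["ifconfig", iface, "up"], check=True)
```

Calling `bounce_interface()` for real needs root, so the script would have to be started with sudo. It may not help in every case: the original report says nothing short of a reboot restored the connection.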

randomMesh avatar Aug 26 '21 21:08 randomMesh

I'm experiencing exactly the same thing on the CM4, kernel 6.6.31+rpt-rpi-v8.

Logs

After the initial error (cm4 kernel: mmc1: Timeout waiting for hardware interrupt), the following keeps repeating indefinitely:

Jul 06 21:16:28.363687 cm4 kernel: brcmfmac: brcmf_sdio_rxfail: abort command, terminate frame, send NAK
Jul 06 21:16:29.885265 cm4 kernel: brcmfmac: brcmf_sdio_rxfail: count never zeroed: last 0xffff
Jul 06 21:16:29.885488 cm4 kernel: brcmfmac: brcmf_sdio_readframes: RXHEADER FAILED: -5

Benedolt avatar Jul 07 '24 20:07 Benedolt

This is almost certainly a silly question, but have you tried to disable wifi power save?

/usr/sbin/iw wlan0 set power_save off
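To confirm that the setting actually took effect, the current state can be read back with `iw dev wlan0 get power_save`. The sketch below wraps that call. It assumes the output looks like `Power save: off`, and the helper names `parse_power_save` and `power_save_enabled` are made up for this example.

```python
import subprocess


def parse_power_save(output):
    """Return True/False for 'Power save: on/off', or None if unrecognised."""
    for line in output.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "power save":
            state = value.strip().lower()
            if state in ("on", "off"):
                return state == "on"
    return None


def power_save_enabled(iface="wlan0"):
    """Ask iw for the current power save state of the interface."""
    result = subprocess.run(
        ["iw", "dev", iface, "get", "power_save"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_save(result.stdout)
```

The parsing is kept separate from the `iw` call so it can be checked on a machine without a wifi interface.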

seamusdemora avatar Jul 08 '24 05:07 seamusdemora

Same with Pi 3 B+. I have power save turned off.

Jul 07 23:29:34 streetcat kernel: brcmfmac: brcmf_cfg80211_set_power_mgmt: power save disabled
Jul 08 20:57:45 streetcat kernel: mmc1: Timeout waiting for hardware interrupt.
Jul 08 20:57:45 streetcat kernel: brcmfmac: mmc_submit_one: CMD53 sg block write failed -110
Jul 08 20:57:45 streetcat kernel: brcmfmac: brcmf_sdio_txfail: sdio error, abort command and terminate frame
Jul 08 20:57:45 streetcat kernel: brcmfmac: brcmf_sdio_hdparse: seq 127: max tx seq number error
Jul 08 20:57:55 streetcat kernel: mmc1: Timeout waiting for hardware interrupt.
Jul 08 20:57:55 streetcat kernel: brcmfmac: mmc_submit_one: CMD53 sg block write failed -110
Jul 08 20:57:55 streetcat kernel: brcmfmac: brcmf_sdio_txfail: sdio error, abort command and terminate frame
Jul 08 20:58:06 streetcat kernel: mmc1: Timeout waiting for hardware interrupt.
Jul 08 20:58:06 streetcat kernel: brcmfmac: mmc_submit_one: CMD53 sg block write failed -110
Jul 08 20:58:06 streetcat kernel: brcmfmac: brcmf_sdio_txfail: sdio error, abort command and terminate frame
Jul 08 20:58:23 streetcat kernel: brcmfmac: brcmf_sdio_hdparse: HW header checksum error
Jul 08 20:58:23 streetcat kernel: brcmfmac: brcmf_sdio_rxfail: terminate frame
<last 2 lines repeats>

Kernel version 6.1.21-v7+, failed in the middle of scp'ing a large file.

dword1511 avatar Jul 09 '24 08:07 dword1511

This message is key: HW header checksum error. It suggests corruption on the SDIO bus, which in turn suggests a lack of power. Does over_voltage=2 in config.txt help?
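To see which of the messages quoted in this thread show up on a given board, a rough sketch like the following can count them in saved kernel log text. The signature strings are copied from the logs above; the names `SIGNATURES` and `count_signatures` are only illustrative.

```python
from collections import Counter

# Substrings taken from the kernel logs posted in this thread
SIGNATURES = (
    "mmc1: Timeout waiting for hardware interrupt",
    "CMD53 sg block write failed",
    "HW header checksum error",
    "RXHEADER FAILED",
)


def count_signatures(log_text):
    """Count how many log lines contain each known failure signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for sig in SIGNATURES:
            if sig in line:
                counts[sig] += 1
    return counts
```

The input could be the saved output of `dmesg` or `journalctl -k`. A non-zero count for the checksum error would match the pattern described here, though it does not by itself prove a power problem.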

pelwell avatar Jul 09 '24 08:07 pelwell

Thank you for your insight, @pelwell

Using over_voltage=2 seems to fix the issue for me! Wifi speed is even a tiny bit faster and the CM4 gets a little bit hotter.

Benedolt avatar Jul 11 '24 12:07 Benedolt

over_voltage=1 may also work for you - it depends on how marginal the voltage is on your CM4.
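When switching between values, it is easy to lose track of what the file currently says. A small sketch for listing every over_voltage line in the text of config.txt is below; the function name `find_over_voltage` is hypothetical, and it deliberately ignores conditional sections such as [all], so treat the result as a rough check only.

```python
def find_over_voltage(config_text):
    """Return every integer over_voltage value found, in file order."""
    values = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out settings
        key, sep, val = line.partition("=")
        if sep and key.strip() == "over_voltage":
            try:
                values.append(int(val.strip()))
            except ValueError:
                pass  # not a plain integer; leave it out rather than guess
    return values
```

The file is usually /boot/config.txt, or /boot/firmware/config.txt on newer Raspberry Pi OS releases, and its contents can be passed in after reading it with `open(...).read()`.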

pelwell avatar Jul 11 '24 13:07 pelwell

I haven't gotten a crash with over_voltage=1 yet - test transfer is still running. However, I can already see that the transfer speed is about a third slower. Is there any downside for running with over_voltage=2?

Benedolt avatar Jul 11 '24 14:07 Benedolt

Voltage should not affect speed in that way, so the slow-down might be caused by retries after CRC errors. It sounds like 2 is the correct over_voltage.

pelwell avatar Jul 11 '24 14:07 pelwell

That makes total sense! Transfer speed with over_voltage=1 is all over the place for me: it bounces up and down, and in the end it's about a third slower. (I don't see CRC errors in dmesg, however.) over_voltage=2 is way more stable, so I'll stick with that. Thanks, @pelwell

Benedolt avatar Jul 12 '24 16:07 Benedolt