mt7915e causing kernel OOPS + panic

Open jpsollie opened this issue 8 months ago • 2 comments

I do not have much information yet and will keep narrowing this down until it is clear what happens. mt76 regularly causes a kernel oops on my Bananapi BPI-R4 board with an mt7915e adapter. Strangely, it is always that adapter; the mt7916, which is also connected, never causes a panic. The pstore output (cat /sys/fs/pstore/dmesg-ramoops-*) is here:

Panic#1 Part1
<3>[42894.490010] mt7915e 0002:01:00.0: Message 00005aed (seq 4) timeout
<2>[42894.496228] SError Interrupt on CPU1, code 0x00000000bf000002 -- SError
<7>[42894.496236] CPU: 1 PID: 7754 Comm: kworker/u8:4 Tainted: G           O       6.6.32 #0
<7>[42894.496245] Hardware name: Bananapi BPI-R4 (DT)
<7>[42894.496249] Workqueue: phy1 mt7915_mac_work [mt7915e]
<7>[42894.496285] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
<7>[42894.496293] pc : mt76_mmio_wr+0x3c/0x9c [mt76]
<7>[42894.496316] lr : mt76_mmio_rmw+0x3c/0x64 [mt76]
<7>[42894.496333] sp : ffffffc085f23c10
<7>[42894.496336] x29: ffffffc085f23c10 x28: ffffff80c6f76000 x27: ffffff80c699a680
<7>[42894.496346] x26: 0000000083101000 x25: 0000000000000000 x24: ffffffc079a1c843
<7>[42894.496355] x23: 00000000000001b8 x22: 00000000000001b8 x21: 0000000083100000
<7>[42894.496364] x20: ffffff80c6f72000 x19: 000000008310e802 x18: 00000000000003b7
<7>[42894.496372] x17: 6974202934207165 x16: 7328206465613530 x15: ffffffc080d32810
<7>[42894.496381] x14: 0000000000000b25 x13: 00000000000003b7 x12: 00000000ffffffea
<7>[42894.496389] x11: 00000000ffffefff x10: ffffffc080d8a810 x9 : ffffffc080d327b8
<7>[42894.496397] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000000c
<7>[42894.496405] x5 : ffffffc079a1e71c x4 : ffffffc079c4438c x3 : ffffff80c6f72000
<7>[42894.496413] x2 : 000000008310e802 x1 : 00000000000001b8 x0 : ffffffc081800000
<0>[42894.496422] Kernel panic - not syncing: Asynchronous SError Interrupt
<2>[42894.496425] SMP: stopping secondary CPUs
<0>[42894.496432] Kernel Offset: disabled
<0>[42894.496434] CPU features: 0x0,00000010,20000000,1000400b
<0>[42894.496440] Memory Limit: none
Oops#2 Part1
<7>[42894.496355] x23: 00000000000001b8 x22: 00000000000001b8 x21: 0000000083100000
<7>[42894.496364] x20: ffffff80c6f72000 x19: 000000008310e802 x18: 00000000000003b7
<7>[42894.496372] x17: 6974202934207165 x16: 7328206465613530 x15: ffffffc080d32810
<7>[42894.496381] x14: 0000000000000b25 x13: 00000000000003b7 x12: 00000000ffffffea
<7>[42894.496389] x11: 00000000ffffefff x10: ffffffc080d8a810 x9 : ffffffc080d327b8
<7>[42894.496397] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000000c
<7>[42894.496405] x5 : ffffffc079a1e71c x4 : ffffffc079c4438c x3 : ffffff80c6f72000
<7>[42894.496413] x2 : 000000008310e802 x1 : 00000000000001b8 x0 : ffffffc081800000
<0>[42894.496422] Kernel panic - not syncing: Asynchronous SError Interrupt
<2>[42894.496425] SMP: stopping secondary CPUs
<0>[42894.496432] Kernel Offset: disabled
<0>[42894.496434] CPU features: 0x0,00000010,20000000,1000400b
<0>[42894.496440] Memory Limit: none
<3>[42894.511650] pstore: backend (ramoops) writing error (-28)
<1>[42894.542685] Unable to handle kernel read from unreadable memory at virtual address 0000000000000044
<1>[42894.542690] Mem abort info:
<1>[42894.542691]   ESR = 0x0000000096000005
<1>[42894.542694]   EC = 0x25: DABT (current EL), IL = 32 bits
<1>[42894.542698]   SET = 0, FnV = 0
<1>[42894.542701]   EA = 0, S1PTW = 0
<1>[42894.542703]   FSC = 0x05: level 1 translation fault
<1>[42894.542707] Data abort info:
<1>[42894.542708]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
<1>[42894.542711]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[42894.542715]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[42894.542719] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000105dbc000
<1>[42894.542724] [0000000000000044] pgd=0800000105dea003, p4d=0800000105dea003, pud=0800000105dea003, pmd=0000000000000000
<0>[42894.542736] Internal error: Oops: 0000000096000005 [#1] SMP
<7>[42894.542742] Modules linked in: netconsole ath9k(O) ath9k_common(O) iptable_nat ath9k_hw(O) ath11k_pci(O) ath11k(O) ath10k_pci(O) ath10k_core(O) ath(O) xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack mt7915e(O) mt76x2e(O) mt76x2_common(O) mt76x02_lib(O) mt7603e(O) mt76_connac_lib(O) mt76(O) mmc_spi mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) ath9k_pci_owl_loader(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables usbnet usblp usbhid uinput tls spidev spi_gpio spi_bitbang sfp rtc_pcf8563 rfcomm r8169 qrtr_mhi qrtr qmi_helpers(O) of_mmc_spi nlmon nfnetlink nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mhi_net mhi mdio_netlink(O) mdio_i2c mdio_gpio mdio_bitbang jc42 hidp hid_mcp2221 hid_generic hid_cp2112 hci_uart gpio_74x164 crc7 crc_itu_t compat(O) cls_flower btusb btrtl btmtk btintel bnep bluetooth atlantic at25 at24 act_vlan 8250_pci crypto_safexcel fuse cls_bpf act_bpf sch_tbf
<7>[42894.542939]  sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sg hid evdev gpio_fan drivetemp i2c_tiny_usb i2c_gpio i2c_smbus industrialio i2c_algo_pcf i2c_algo_pca i2c_algo_bit gpio_pcf857x gpio_pca953x i2c_mux_reg i2c_mux_pca954x i2c_mux_pca9541 i2c_mux_gpio i2c_mux sp805_wdt ledtrig_usbport ledtrig_oneshot cryptodev(O) nfsv4 nfsd nfs ifb rpcsec_gss_krb5 auth_rpcgss oid_registry tun lockd sunrpc grace dns_resolver nls_utf8 nls_iso8859_1 nls_cp437 rfkill eeprom_93cx6 macsec bfq xts xcbc crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_arm64 sha1_ce sha1_generic seqiv rmd160 pcbc michael_mic md5 echainiv geniv des_generic libdes cts chacha20poly1305 cbc authencesn authenc arc4 uas usb_storage sdhci_pltfm sdhci leds_ws2812b(O) leds_gca230718(O) gpio_keys_polled gpio_keys pf_ring(O) leds_tlc591xx leds_pca963x leds_pca955x leds_lp5562 leds_lp55xx_common leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd input_leds
<7>[42894.543136]  input_core fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd ubootenv_nvram(O) vfat fat btrfs xor xor_neon raid6_pq libcrc32c dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax mux_gpio usbcore ptp aquantia pps_core mii tpm encrypted_keys trusted [last unloaded: netconsole]
<7>[42894.543196] CPU: 1 PID: 7754 Comm: kworker/u8:4 Tainted: G           O       6.6.32 #0
<7>[42894.543204] Hardware name: Bananapi BPI-R4 (DT)
<7>[42894.543207] Workqueue: phy1 mt7915_mac_work [mt7915e]
<7>[42894.543235] pstate: 604001c5 (nZCv dAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
<7>[42894.543243] pc : mtk_poll_controller+0x110/0x2a0
<7>[42894.543255] lr : mtk_poll_controller+0x104/0x2a0
<7>[42894.543261] sp : ffffffc085f236b0
<7>[42894.543263] x29: ffffffc085f236b0 x28: ffffff80c8dec818 x27: ffffffc080e87468
<7>[42894.543273] x26: 00000000000000c8 x25: ffffffc080cf6008 x24: ffffff80c2a42e88
<7>[42894.543281] x23: 0000000000000000 x22: ffffff80c2a3e0a4 x21: ffffff80c2a3e0a8
<7>[42894.543289] x20: ffffff80c2a42000 x19: ffffff80c2a3e080 x18: 0000000000000005
<7>[42894.543298] x17: 2d205449442d204f x16: 43542d204f41552d x15: 204e41502b206669
<7>[42894.543306] x14: 61642076637a4e28 x13: 0a292d2d3d455059 x12: 0000000000000000
<7>[42894.543314] x11: 936e757c63a26e70 x10: 14f61337dc6f22d3 x9 : 22092a0de53bbf16
<7>[42894.543322] x8 : ffffff80cc3ed038 x7 : 00000000fffffffb x6 : ffffff80cc3ecfc0
<7>[42894.543330] x5 : ffffff80c2a42e98 x4 : 0000000000000000 x3 : 0000000000000000
<7>[42894.543338] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000000
<7>[42894.543345] Call trace:
<7>[42894.543348]  mtk_poll_controller+0x110/0x2a0
<7>[42894.543355]  netpoll_poll_dev+0xc8/0x238
<7>[42894.543363]  __netpoll_send_skb+0x188/0x254
<7>[42894.543369]  netpoll_send_udp+0x250/0x3e0
<7>[42894.543374]  write_msg+0x124/0x15c [netconsole]
<7>[42894.543389]  console_flush_all+0x198/0x4dc
<7>[42894.543398]  console_flush_on_panic+0x30/0xb8
<7>[42894.543407]  panic+0x15c/0x30c
<7>[42894.543415]  nmi_panic+0x68/0x6c
<7>[42894.543421]  arm64_serror_panic+0x68/0x78
<7>[42894.543426]  do_serror+0x24/0x60
<7>[42894.543430]  el1h_64_error_handler+0x2c/0x40
<7>[42894.543441]  el1h_64_error+0x68/0x6c
<7>[42894.543446]  mt76_mmio_wr+0x3c/0x9c [mt76]
<7>[42894.543465]  __mt7915_reg_remap_addr+0xbc/0x1a8 [mt7915e]
<7>[42894.543487]  mt7915_rr+0xac/0xf4 [mt7915e]
<7>[42894.543508]  mt7915_update_channel+0xd8/0x1a0 [mt7915e]
<7>[42894.543528]  mt76_update_survey+0x2c/0x110 [mt76]
<7>[42894.543546]  mt7915_mac_work+0x2c/0x130 [mt7915e]
<7>[42894.543567]  process_one_work+0x158/0x368
<7>[42894.543577]  worker_thread+0x2a8/0x484
<7>[42894.543584]  kthread+0xdc/0xe8
<7>[42894.543591]  ret_from_fork+0x10/0x20
<0>[42894.543601] Code: 94030184 36000400 f90023f9 f948de80 (b9404417) 
<4>[42894.543605] ---[ end trace 0000000000000000 ]---
Panic#3 Part1
<7>[42894.496381] x14: 0000000000000b25 x13: 00000000000003b7 x12: 00000000ffffffea
<7>[42894.496389] x11: 00000000ffffefff x10: ffffffc080d8a810 x9 : ffffffc080d327b8
<7>[42894.496397] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 000000000000000c
<7>[42894.496405] x5 : ffffffc079a1e71c x4 : ffffffc079c4438c x3 : ffffff80c6f72000
<7>[42894.496413] x2 : 000000008310e802 x1 : 00000000000001b8 x0 : ffffffc081800000
<0>[42894.496422] Kernel panic - not syncing: Asynchronous SError Interrupt
<2>[42894.496425] SMP: stopping secondary CPUs
<0>[42894.496432] Kernel Offset: disabled
<0>[42894.496434] CPU features: 0x0,00000010,20000000,1000400b
<0>[42894.496440] Memory Limit: none
<3>[42894.511650] pstore: backend (ramoops) writing error (-28)
<1>[42894.542685] Unable to handle kernel read from unreadable memory at virtual address 0000000000000044
<1>[42894.542690] Mem abort info:
<1>[42894.542691]   ESR = 0x0000000096000005
<1>[42894.542694]   EC = 0x25: DABT (current EL), IL = 32 bits
<1>[42894.542698]   SET = 0, FnV = 0
<1>[42894.542701]   EA = 0, S1PTW = 0
<1>[42894.542703]   FSC = 0x05: level 1 translation fault
<1>[42894.542707] Data abort info:
<1>[42894.542708]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
<1>[42894.542711]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[42894.542715]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[42894.542719] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000105dbc000
<1>[42894.542724] [0000000000000044] pgd=0800000105dea003, p4d=0800000105dea003, pud=0800000105dea003, pmd=0000000000000000
<0>[42894.542736] Internal error: Oops: 0000000096000005 [#1] SMP
<7>[42894.542742] Modules linked in: netconsole ath9k(O) ath9k_common(O) iptable_nat ath9k_hw(O) ath11k_pci(O) ath11k(O) ath10k_pci(O) ath10k_core(O) ath(O) xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack mt7915e(O) mt76x2e(O) mt76x2_common(O) mt76x02_lib(O) mt7603e(O) mt76_connac_lib(O) mt76(O) mmc_spi mac80211(O) iptable_mangle iptable_filter ipt_REJECT ip_tables cfg80211(O) ath9k_pci_owl_loader(O) xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG x_tables usbnet usblp usbhid uinput tls spidev spi_gpio spi_bitbang sfp rtc_pcf8563 rfcomm r8169 qrtr_mhi qrtr qmi_helpers(O) of_mmc_spi nlmon nfnetlink nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mhi_net mhi mdio_netlink(O) mdio_i2c mdio_gpio mdio_bitbang jc42 hidp hid_mcp2221 hid_generic hid_cp2112 hci_uart gpio_74x164 crc7 crc_itu_t compat(O) cls_flower btusb btrtl btmtk btintel bnep bluetooth atlantic at25 at24 act_vlan 8250_pci crypto_safexcel fuse cls_bpf act_bpf sch_tbf
<7>[42894.542939]  sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact sg hid evdev gpio_fan drivetemp i2c_tiny_usb i2c_gpio i2c_smbus industrialio i2c_algo_pcf i2c_algo_pca i2c_algo_bit gpio_pcf857x gpio_pca953x i2c_mux_reg i2c_mux_pca954x i2c_mux_pca9541 i2c_mux_gpio i2c_mux sp805_wdt ledtrig_usbport ledtrig_oneshot cryptodev(O) nfsv4 nfsd nfs ifb rpcsec_gss_krb5 auth_rpcgss oid_registry tun lockd sunrpc grace dns_resolver nls_utf8 nls_iso8859_1 nls_cp437 rfkill eeprom_93cx6 macsec bfq xts xcbc crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha512_arm64 sha1_ce sha1_generic seqiv rmd160 pcbc michael_mic md5 echainiv geniv des_generic libdes cts chacha20poly1305 cbc authencesn authenc arc4 uas usb_storage sdhci_pltfm sdhci leds_ws2812b(O) leds_gca230718(O) gpio_keys_polled gpio_keys pf_ring(O) leds_tlc591xx leds_pca963x leds_pca955x leds_lp5562 leds_lp55xx_common leds_gpio xhci_plat_hcd xhci_pci xhci_mtk_hcd xhci_hcd input_leds
<7>[42894.543136]  input_core fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd ubootenv_nvram(O) vfat fat btrfs xor xor_neon raid6_pq libcrc32c dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax mux_gpio usbcore ptp aquantia pps_core mii tpm encrypted_keys trusted [last unloaded: netconsole]
<7>[42894.543196] CPU: 1 PID: 7754 Comm: kworker/u8:4 Tainted: G           O       6.6.32 #0
<7>[42894.543204] Hardware name: Bananapi BPI-R4 (DT)
<7>[42894.543207] Workqueue: phy1 mt7915_mac_work [mt7915e]
<7>[42894.543235] pstate: 604001c5 (nZCv dAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
<7>[42894.543243] pc : mtk_poll_controller+0x110/0x2a0
<7>[42894.543255] lr : mtk_poll_controller+0x104/0x2a0
<7>[42894.543261] sp : ffffffc085f236b0
<7>[42894.543263] x29: ffffffc085f236b0 x28: ffffff80c8dec818 x27: ffffffc080e87468
<7>[42894.543273] x26: 00000000000000c8 x25: ffffffc080cf6008 x24: ffffff80c2a42e88
<7>[42894.543281] x23: 0000000000000000 x22: ffffff80c2a3e0a4 x21: ffffff80c2a3e0a8
<7>[42894.543289] x20: ffffff80c2a42000 x19: ffffff80c2a3e080 x18: 0000000000000005
<7>[42894.543298] x17: 2d205449442d204f x16: 43542d204f41552d x15: 204e41502b206669
<7>[42894.543306] x14: 61642076637a4e28 x13: 0a292d2d3d455059 x12: 0000000000000000
<7>[42894.543314] x11: 936e757c63a26e70 x10: 14f61337dc6f22d3 x9 : 22092a0de53bbf16
<7>[42894.543322] x8 : ffffff80cc3ed038 x7 : 00000000fffffffb x6 : ffffff80cc3ecfc0
<7>[42894.543330] x5 : ffffff80c2a42e98 x4 : 0000000000000000 x3 : 0000000000000000
<7>[42894.543338] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000000
<7>[42894.543345] Call trace:
<7>[42894.543348]  mtk_poll_controller+0x110/0x2a0
<7>[42894.543355]  netpoll_poll_dev+0xc8/0x238
<7>[42894.543363]  __netpoll_send_skb+0x188/0x254
<7>[42894.543369]  netpoll_send_udp+0x250/0x3e0
<7>[42894.543374]  write_msg+0x124/0x15c [netconsole]
<7>[42894.543389]  console_flush_all+0x198/0x4dc
<7>[42894.543398]  console_flush_on_panic+0x30/0xb8
<7>[42894.543407]  panic+0x15c/0x30c
<7>[42894.543415]  nmi_panic+0x68/0x6c
<7>[42894.543421]  arm64_serror_panic+0x68/0x78
<7>[42894.543426]  do_serror+0x24/0x60
<7>[42894.543430]  el1h_64_error_handler+0x2c/0x40
<7>[42894.543441]  el1h_64_error+0x68/0x6c
<7>[42894.543446]  mt76_mmio_wr+0x3c/0x9c [mt76]
<7>[42894.543465]  __mt7915_reg_remap_addr+0xbc/0x1a8 [mt7915e]
<7>[42894.543487]  mt7915_rr+0xac/0xf4 [mt7915e]
<7>[42894.543508]  mt7915_update_channel+0xd8/0x1a0 [mt7915e]
<7>[42894.543528]  mt76_update_survey+0x2c/0x110 [mt76]
<7>[42894.543546]  mt7915_mac_work+0x2c/0x130 [mt7915e]
<7>[42894.543567]  process_one_work+0x158/0x368
<7>[42894.543577]  worker_thread+0x2a8/0x484
<7>[42894.543584]  kthread+0xdc/0xe8
<7>[42894.543591]  ret_from_fork+0x10/0x20
<0>[42894.543601] Code: 94030184 36000400 f90023f9 f948de80 (b9404417) 
<4>[42894.543605] ---[ end trace 0000000000000000 ]---
<0>[42894.557908] Kernel panic - not syncing: Oops: Fatal exception in interrupt
<0>[42894.557911] Kernel Offset: disabled
<0>[42894.557913] CPU features: 0x0,00000010,20000000,1000400b
<0>[42894.557917] Memory Limit: none
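The "Mem abort info" lines in the log can be cross-checked by splitting the ESR value by hand. The following is a minimal sketch, assuming the standard arm64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and the data-abort ISS layout (FSC in bits 5:0, WnR in bit 6, ISV in bit 24). The helper names `decode_esr` and `decode_data_abort_iss` are made up for this illustration and are not part of any kernel tool.

```python
def decode_esr(esr: int) -> dict:
    """Split an arm64 ESR_ELx value into its top-level fields."""
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,    # instruction length bit, bit 25
        "ISS": esr & 0x1FFFFFF,     # instruction specific syndrome, bits [24:0]
    }


def decode_data_abort_iss(iss: int) -> dict:
    """Decode a few ISS fields that are only meaningful for data aborts."""
    return {
        "FSC": iss & 0x3F,          # fault status code, bits [5:0]
        "WnR": (iss >> 6) & 0x1,    # 1 = write, 0 = read
        "ISV": (iss >> 24) & 0x1,   # whether the syndrome details are valid
    }


# ESR value copied from the "Mem abort info" section of the log.
fields = decode_esr(0x96000005)
abort = decode_data_abort_iss(fields["ISS"])

print({name: hex(value) for name, value in fields.items()})
print({name: hex(value) for name, value in abort.items()})
```

For this ESR the fields agree with what the kernel printed: EC 0x25 (data abort at the current EL), a read access (WnR = 0), and FSC 0x05 (level 1 translation fault), which fits a read from the near-NULL address 0x44.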

Some build details:

root@OpenWrt:/# modinfo mt76
filename:       /lib/modules/6.6.32/mt76.ko
license:        Dual BSD/GPL
depends:        mac80211,cfg80211
name:           mt76
vermagic:       6.6.32 SMP mod_unload aarch64

And the build version:

# cat package/kernel/mt76/Makefile | head
include $(TOPDIR)/rules.mk

PKG_NAME:=mt76
PKG_RELEASE=1

PKG_LICENSE:=GPLv2
PKG_LICENSE_FILES:=

PKG_SOURCE_URL:=https://github.com/openwrt/mt76
PKG_SOURCE_PROTO:=git
PKG_SOURCE_DATE:=2024-05-17
PKG_SOURCE_VERSION:=513c131c6309712a51502870b041f45b4bd6a6d4
PKG_MIRROR_HASH:=3e5d8ee6b8b122cc4e32668fdde0552a9fa23819b7ebdc758ecb63b5f761683a

Any tips on how to narrow down this issue would be appreciated; I can't even predict what causes the error right now. I know it happens often (the system never reaches more than one day of uptime), but I do not know when. I only know that it always seems to be triggered by the mt7915, not the mt7916.
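One way to compare crashes over time is to pull the call trace out of each pstore dump and check whether the same driver frames keep showing up. The following is a rough sketch, assuming the dump lines have the same format as the log above. `extract_call_trace` is a hypothetical helper, and `SAMPLE` is an abbreviated excerpt of the trace above, not a complete dump.

```python
import re

# One stack frame line, e.g. "<7>[42894.543446]  mt76_mmio_wr+0x3c/0x9c [mt76]".
TRACE_LINE = re.compile(
    r"^<\d>\[\s*\d+\.\d+\]\s+(?P<sym>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?\s*$"
)


def extract_call_trace(dump: str) -> list:
    """Return (symbol, module) pairs for the first call trace in a dump."""
    frames = []
    in_trace = False
    for line in dump.splitlines():
        if "Call trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = TRACE_LINE.match(line)
            if match is None:
                break  # first non-frame line ends the trace
            # Frames without a [module] suffix belong to the core kernel image.
            frames.append((match.group("sym"), match.group("mod") or "kernel"))
    return frames


# Abbreviated excerpt of the trace above (several frames omitted).
SAMPLE = """\
<7>[42894.543345] Call trace:
<7>[42894.543348]  mtk_poll_controller+0x110/0x2a0
<7>[42894.543374]  write_msg+0x124/0x15c [netconsole]
<7>[42894.543446]  mt76_mmio_wr+0x3c/0x9c [mt76]
<7>[42894.543465]  __mt7915_reg_remap_addr+0xbc/0x1a8 [mt7915e]
<0>[42894.543601] Code: 94030184 36000400 f90023f9 f948de80 (b9404417)
"""

frames = extract_call_trace(SAMPLE)
for symbol, module in frames:
    print(f"{module:12s} {symbol}")
```

In the full trace above, the frames from panic up to mtk_poll_controller come from netconsole flushing the console during the panic, so the NULL dereference at 0x44 looks like a secondary fault. The frames below el1h_64_error (mt76_mmio_wr called from __mt7915_reg_remap_addr) are where the SError was taken, and those are probably the ones worth comparing between crashes.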

jpsollie, Jun 05 '24 06:06