[ESP-WIFI-MESH] Guru Meditation Error found when Fixed root was node not available. (IDFGH-13169)
Answers checklist.
- [X] I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
- [X] I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
- [X] I have searched the issue tracker for a similar issue and not found a similar issue.
General issue report
I encountered an issue that was not resolved after RTC_SW_CPU_RST. The only thing that worked was a hard reset, which meant powering off the device.
I (11013294) mesh: <WIFI_EVENT_SCAN_DONE>status:fail, num:3, id:199
I (11013295) mesh: [IO]enable self-organizing, search parent<adaptive>
Guru Meditation Error: Core 0 panic'ed (Memory protection fault).
Unknown operation at address 0xffffffff not permitted (INVALID_MEM_TYPE)
Core 0 register dump:
PC : 0x4003229f PS : 0x00060a34 A0 : 0x800320ff A1 : 0x3fff50a0
A2 : 0x3ffd5d0c A3 : 0x3fff58d0 A4 : 0x00060a20 A5 : 0x00000000
A6 : 0x00000000 A7 : 0x3fff58c8 A8 : 0x7ffeb19c A9 : 0x00000000
A10 : 0x00000001 A11 : 0xffffffff A12 : 0x00000000 A13 : 0x00000000
A14 : 0x00000001 A15 : 0x3ffdffd8 SAR : 0x0000001d EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x40025314
Backtrace: 0x4003229c:0x3fff50a0 0x400320fc:0x3fff50c0 0x40025b7e:0x3fff50e0 0x400346a1:0x3fff5100 0x400faf5f:0x3fff5120 0x400e9ef9:0x3fff5140 0x400ce221:0x3fff5170 0x400d480a:0x3fff52e0 0x4002d8b6:0x00000000 |<-CORRUPTED
ELF file SHA256: 8765ae40b9c74068
Rebooting...
ESP-ROM:esp32s2-rc4-20191025
Build:Oct 25 2019
rst:0xc (RTC_SW_CPU_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x400254b2
SPIWP:0xee
mode:DIO, clock div:1
load:0x3ffe6110,len:0x186c
load:0x4004b000,len:0x4
load:0x4004b004,len:0xb0c
load:0x4004f000,len:0x3268
entry 0x4004b1cc
[0;32mI (26) boot: ESP-IDF v5.2.1-dirty 2nd stage bootloader[0m
[0;32mI (26) boot: compile time Jun 14 2024 11:58:06[0m
[0;32mI (26) boot: chip revision: v1.0[0m
[0;32mI (30) boot.esp32s2: SPI Speed : 80MHz[0m
[0;32mI (35) boot.esp32s2: SPI Mode : DIO[0m
[0;32mI (40) boot.esp32s2: SPI Flash Size : 4MB[0m
[0;32mI (44) boot: Enabling RNG early entropy source...[0m
[0;32mI (50) boot: Partition Table:[0m
[0;32mI (53) boot: ## Label Usage Type ST Offset Length[0m
[0;32mI (61) boot: 0 nvs WiFi data 01 02 0000d000 00004000[0m
[0;32mI (68) boot: 1 otadata OTA data 01 00 00011000 00002000[0m
[0;32mI (76) boot: 2 phy_init RF data 01 01 00013000 00001000[0m
[0;32mI (83) boot: 3 edf WiFi data 01 02 00014000 00004000[0m
[0;32mI (90) boot: 4 storage Unknown data 01 81 00018000 00020000[0m
[0;32mI (98) boot: 5 ota_0 OTA app 00 10 00040000 00170000[0m
[0;32mI (105) boot: 6 ota_1 OTA app 00 11 001b0000 00170000[0m
[0;32mI (113) boot: End of partition table[0m
[0;32mI (117) esp_image: segment 0: paddr=00040020 vaddr=3f000020 size=4a5dch (304604) map[0m
[0;32mI (187) esp_image: segment 1: paddr=0008a604 vaddr=3ffc9bc0 size=03e34h ( 15924) load[0m
[0;32mI (190) esp_image: segment 2: paddr=0008e440 vaddr=40024000 size=01bd8h ( 7128) load[0m
[0;32mI (194) esp_image: segment 3: paddr=00090020 vaddr=40080020 size=e1c78h (924792) map[0m
[0;32mI (385) esp_image: segment 4: paddr=00171ca0 vaddr=40025bd8 size=13fe8h ( 81896) load[0m
[0;32mI (417) boot: Loaded app from partition at offset 0x40000[0m
[0;32mI (417) boot: Disabling RNG early entropy source...[0m
[0;32mI (428) cpu_start: Unicore app[0m
[0;32mI (428) cache: Instruction cache : size 8KB, 4Ways, cache line size 32Byte[0m
[0;32mI (431) cache: Data cache : size 8KB, 4Ways, cache line size 32Byte[0m
Guru Meditation Error: Core 0 panic'ed (Cache disabled but cached memory region accessed).
Core 0 register dump:
PC : 0x40039b8d PS : 0x00060d34 A0 : 0x80025ea2 A1 : 0x3fffe260
A2 : 0x02000000 A3 : 0x00000000 A4 : 0x02000000 A5 : 0x00000003
A6 : 0x00000000 A7 : 0x00000000 A8 : 0x00000014 A9 : 0x00000001
A10 : 0x00000028 A11 : 0x00000028 A12 : 0x00000800 A13 : 0x00000003
A14 : 0x00000000 A15 : 0x00000004 SAR : 0x00000007 EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x00000800 LEND : 0x00000003 LCOUNT : 0x40025314
Backtrace: 0x40039b8a:0x3fffe260 0x40025e9f:0x3fffe270 0x40085de1:0x3fffe290 0x400864a8:0x3fffe2b0 0x40025293:0x3fffe2f0 0x40050439:0x3fffe330 0x40050879:0x3fffe370 0x4004b235:0x3fffe4a0 0x40011371:0x3fffe560 0x4000f788:0x3fffe6f0 0x400072a5:0x3fffe710
@andy-danieal
Can you provide the .elf file when the issue happen?
I have sent you an email. Please check email and let me know if you have any questions.
@andy-danieal
I didn't receive any email that include a .elf file.
We sent you an email Ma li at Jul 1, 2024. Please check that email.
Also, We having the same issue with the Root Mesh node. After being active for two days, it stops working. Each 10 seconds, 30 ESP devices send a command and receive a response.
I am Ma Li. I didn't receive your email. If you prefer not to upload the elf file to GitHub, you can send it directly to [email protected].
@zhangyanjiaoesp, I have sent on email. please check it.
@andy-danieal That's weird. I didn't get your email either. You can upload your elf file here and share the link here.
And you can set the deletion conditions:
@andy-danieal Is the elf file the elf when the crash occurred? root or child ?
@andy-danieal Please use this wifi lib to test, thanks. wifi_lib_s2_0710.zip
wifi firmware version: fabad8c
@zhangyanjiaoesp, How do I add an existing project to this wifi lib.? And Can you please share the step for us?
@andy-danieal replace the wifi libs in idf/components/esp_wifi/lib/esp32s2
@zhangyanjiaoesp, We have tested the wifi lib, but we are still getting the same issue. and We noticed that the issue exists after the device has been running for 6 to 7 hours.
Also, We have attached the logs we captured using Putty. Unfortunately, we were unable to capture the entire log due to disconnections. We will attempt to recreate the issue and capture the complete log.
@zhangyanjiaoesp, We have tested the wifi lib, but we are still getting the same issue. and We noticed that the issue exists after the device has been running for 6 to 7 hours.
Also, We have attached the logs we captured using Putty. Unfortunately, we were unable to capture the entire log due to disconnections. We will attempt to recreate the issue and capture the complete log.
Ok, waiting for your logs, I have added some debug logs in the wifi lib, maybe it can help us find the root cause.
@andy-danieal Are you using the wifi lib I provided? Can you open the wifi information log? I can't get any useful information (including the debug information I added) from the log.
And you can enable the following option, thus we can get some back trace information when the crash happen.
@zhangyanjiaoesp, Thank you for your support. We have added a root node for a long-term test, and unfortunately, the original issue has not reappeared. However, we did encounter another issue.
Guru Meditation Error: Core 0 panic'ed (StoreProhibited). Exception was unhandled.
Core 0 register dump:
PC : 0x40032a62 PS : 0x00060d33 A0 : 0x80032400 A1 : 0x3ffe14e0
A2 : 0x3fff7ab4 A3 : 0x00000030 A4 : 0x3f03cac2 A5 : 0x00000006
A6 : 0x00000000 A7 : 0x3ffd5a1c A8 : 0x00000001 A9 : 0x3ffd5a1c
A10 : 0x00000002 A11 : 0x01000217 A12 : 0x00000024 A13 : 0x3ffd5a40
A14 : 0x00000001 A15 : 0x4002c0e4 SAR : 0x0000001f EXCCAUSE: 0x0000001d
EXCVADDR: 0x01000223 LBEG : 0x00000024 LEND : 0x3ffd5a40 LCOUNT : 0x40026b5c
Backtrace: 0x40032a5f:0x3ffe14e0 0x400323fd:0x3ffe1500 0x40025a05:0x3ffe1520 0x40025a60:0x3ffe1540 0x40025a95:0x3ffe1560 0x40034905:0x3ffe1580 0x400285c1:0x3ffe15a0 0x400e2d67:0x3ffe15c0 0x400e2dac:0x3ffe15e0 0x400e3872:0x3ffe1600 0x400e8259:0x3ffe17c0 0x400ea0c6:0x3ffe17e0 0x400e8467:0x3ffe1840 0x40037fa1:0x3ffe1860 0x400360a4:0x3ffe1880 0x4002e1c6:0x3ffe18b0
As per Core dump core-dump-decode.txt
==================== CURRENT THREAD STACK =====================
#0 remove_free_block (sl=1, fl=2, block=0x3fff7ab4, control=0x3ffd5a1c) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:332
#1 block_locate_free (size=<optimized out>, control=<optimized out>) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:567
#2 tlsf_malloc (tlsf=0x3ffd5a1c, size=<optimized out>) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/tlsf/tlsf.c:1005
#3 0x40032400 in multi_heap_malloc_impl (heap=0x3ffd5a08, size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/multi_heap.c:210
#4 0x40025a08 in heap_caps_malloc_base (size=48, caps=6144) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:179
#5 0x40025a63 in heap_caps_malloc (size=48, caps=6144) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:202
#6 0x40025a98 in heap_caps_malloc_default (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/heap/heap_caps.c:228
#7 0x40034908 in malloc (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/newlib/heap.c:24
#8 0x400285c4 in wifi_malloc (size=48) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/esp_wifi/esp32s2/esp_adapter.c:65
#9 0x400e2d6a in mesh_malloc ()
#10 0x400e2daf in esp_mesh_create_context ()
#11 0x400e3875 in esp_mesh_wifi_recv_cb ()
#12 0x400e825c in hostap_deliver_data ()
#13 0x400ea0c9 in hostap_input ()
#14 0x400e846a in ap_rx_cb ()
#15 0x40037fa4 in ppRxPkt ()
#16 0x400360a7 in ppTask ()
#17 0x4002e1c9 in vPortTaskWrapper (pxCode=0x40035fb4 <ppTask>, pvParameters=0x0) at C:/Espressif/frameworks/esp-idf-v5.2.1/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:134
@andy-danieal
please set the Heap memory debugging like the following and test again, thanks.
Thanks for reporting, will close due to short of feedback, feel free to reopen with more updates. Thanks for using our Espressif product!
@zhangyanjiaoesp, Having tested a fixed root node that hung after a 10-20 day interval, it has worked fine after being powered off. The same issue occurred on a child device when the root node hung; some devices were affected, but not all.
Attached root UART dump and log file. log.txt log-uart-dump.txt
Any Update?
@andy-danieal The current crash issue appears to be different from the previous one. Are you using the version where the debug logs were added last time?
@andy-danieal Please use this wifi lib to test, thanks. wifi_lib_s2_0710.zip
wifi firmware version: fabad8c
@zhangyanjiaoesp, Both cases threw an exception on a Memory protection fault. After that, the device continuously reset due to a Cached memory region exception. We couldn't recover the device without a power supply, which was a terrible situation for the client side.
We can't update that library on the client side, but we will demonstrate the setup and wait to reproduce the issue.
Also, We need recovery method cached memory region exception without power supply. and root cause of that error?
@andy-danieal
Although each crash is caused by accessing an illegal address, the disassembly paths differ each time. This is likely due to memory corruption, which makes it difficult to identify the root cause without a reliable reproduction method. The Cached memory region exception Issue arises from the first crash, so we need to pinpoint the root cause of the first crash. Could you provide a stable reproduction method?
Dear @zhangyanjiaoesp,
We have alloted max heap memory after that throw INVALID_MEM_TYPE exception but every time device is going to Cached memory region exception. Why isn’t it going to throw exception and Reboot Device?
Guru Meditation Error: Core 0 panic'ed (Memory protection fault).
Unknown operation at address 0xffffffff not permitted (INVALID_MEM_TYPE)
Core 0 register dump:
PC : 0x4003307b PS : 0x00060f34 A0 : 0x80032edb A1 : 0x3ffed270
--- 0x4003307b: block_set_prev_free at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf_block_functions.h:56
(inlined by) block_mark_as_free at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf_block_functions.h:110
(inlined by) tlsf_free at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf.c:1202
A2 : 0x3ffd5f7c A3 : 0x3fffbc98 A4 : 0x00000000 A5 : 0x3f002900
A6 : 0x00000000 A7 : 0x3fffbc90 A8 : 0x44065d08 A9 : 0x00000000
A10 : 0x00000001 A11 : 0xffffffff A12 : 0x00000000 A13 : 0x00000000
A14 : 0x00000000 A15 : 0x3ffd6b04 SAR : 0x00000004 EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x00000000 LEND : 0x00000000 LCOUNT : 0x40025468
--- 0x40025468: xt_highint4 at /home/swaroop/esp/esp-idf/components/esp_system/port/soc/esp32s2/highint_hdl.S:52
Backtrace: 0x40033078:0x3ffed270 0x40032ed8:0x3ffed290 0x40025d7a:0x3ffed2b0 0x400979f2:0x3ffed2d0 0x40094ef5:0x3ffed2f0 0x4002e6e2:0x3ffed4a0
--- 0x40033078: block_link_next at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf_block_functions.h:102
(inlined by) block_mark_as_free at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf_block_functions.h:109
(inlined by) tlsf_free at /home/swaroop/esp/esp-idf/components/heap/tlsf/tlsf.c:1202
0x40032ed8: multi_heap_free_impl at /home/swaroop/esp/esp-idf/components/heap/multi_heap.c:235
0x40025d7a: heap_caps_free at /home/swaroop/esp/esp-idf/components/heap/heap_caps_base.c:70
0x400979f2: zh_vector_delete_item at /home/swaroop/Workspace/ESP32/DSPVWH_Repo/DSPVWH_Controller/components/zh_vector/zh_vector.c:201
0x40094ef5: _processing at /home/swaroop/Workspace/ESP32/DSPVWH_Repo/DSPVWH_Controller/components/zh_network/zh_network.c:335
0x4002e6e2: vPortTaskWrapper at /home/swaroop/esp/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:139
ELF file SHA256: 560d896f78ab0575
Rebooting...
ESP-ROM:esp32s2-rc4-20191025
Build:Oct 25 2019
rst:0xc (RTC_SW_CPU_RST),boot:0x8 (SPI_FAST_FLASH_BOOT)
Saved PC:0x4002563a
--- 0x4002563a: esp_restart_noos at /home/swaroop/esp/esp-idf/components/esp_system/port/soc/esp32s2/system_internal.c:117
SPIWP:0xee
mode:DIO, clock div:1
load:0x3ffe5110,len:0x1950
load:0x4004a000,len:0x4
load:0x4004a004,len:0xb38
load:0x4004e000,len:0x333c
entry 0x4004a1d0
I (26) boot: ESP-IDF v5.3.2-1-g9a6aba7401 2nd stage bootloader
I (26) boot: compile time Dec 15 2024 15:11:43
I (27) boot: chip revision: v1.0
I (31) boot: efuse block revision: v0.2
I (35) boot.esp32s2: SPI Speed : 80MHz
I (40) boot.esp32s2: SPI Mode : DIO
I (45) boot.esp32s2: SPI Flash Size : 4MB
I (50) boot: Enabling RNG early entropy source...
I (55) boot: Partition Table:
I (58) boot: ## Label Usage Type ST Offset Length
I (66) boot: 0 nvs WiFi data 01 02 0000d000 00004000
I (73) boot: 1 otadata OTA data 01 00 00011000 00002000
I (81) boot: 2 phy_init RF data 01 01 00013000 00001000
I (88) boot: 3 edf WiFi data 01 02 00014000 00004000
I (96) boot: 4 storage Unknown data 01 81 00018000 00020000
I (103) boot: 5 ota_0 OTA app 00 10 00040000 00170000
I (111) boot: 6 ota_1 OTA app 00 11 001b0000 00170000
I (118) boot: End of partition table
I (123) esp_image: segment 0: paddr=00040020 vaddr=3f000020 size=49088h (299144) map
I (190) esp_image: segment 1: paddr=000890b0 vaddr=3ffcb640 size=04244h ( 16964) load
I (194) esp_image: segment 2: paddr=0008d2fc vaddr=40024000 size=02d1ch ( 11548) load
I (198) esp_image: segment 3: paddr=00090020 vaddr=40080020 size=dfa70h (916080) map
I (384) esp_image: segment 4: paddr=0016fa98 vaddr=40026d1c size=14920h ( 84256) load
I (416) boot: Loaded app from partition at offset 0x40000
I (416) boot: Disabling RNG early entropy source...
I (428) cache: Instruction cache : size 8KB, 4Ways, cache line size 32Byte
I (428) cache: Data cache : size 8KB, 4Ways, cache line size 32Byte
I (434) cpu_start: Unicore app
Guru Meditation Error: Core 0 panic'ed (Cache disabled but cached memory region accessed).
@andy-danieal Did you enable the CONFIG_HEAP_POISONING_COMPREHENSIVE option during testing?
@andy-danieal Do you still have this problem?
Thanks for reporting, will close due to inactivity, feel free to reopen if the issue still happens, thanks.