esp32: fix a crash with PSRAM + SMP
Summary
this function is called via esp_spiram_init_cache early in the boot.
Impact
Testing
esp32-devkitc:psram + smp enabled
Thanks for submitting it, @yamt !
I will test it on our internal CI. Please don't merge it yet.
Hi @yamt
If I set CONFIG_DEBUG_ASSERTIONS and CONFIG_DEBUG_FEATURES it fails to boot. I guess it's somehow debug-asserting. Can you take a look at it?
The defconfig I used for it:
#
# This file is autogenerated: PLEASE DO NOT EDIT IT.
#
# You can use "make menuconfig" to make any modifications to the installed .config file.
# You can then do "make savedefconfig" to generate a new defconfig file that includes your
# modifications.
#
# CONFIG_ARCH_LEDS is not set
# CONFIG_NSH_ARGCAT is not set
# CONFIG_NSH_CMDOPT_HEXDUMP is not set
CONFIG_ARCH="xtensa"
CONFIG_ARCH_BOARD="esp32-devkitc"
CONFIG_ARCH_BOARD_COMMON=y
CONFIG_ARCH_BOARD_ESP32_DEVKITC=y
CONFIG_ARCH_CHIP="esp32"
CONFIG_ARCH_CHIP_ESP32=y
CONFIG_ARCH_CHIP_ESP32WROVER=y
CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_ARCH_STACKDUMP=y
CONFIG_ARCH_XTENSA=y
CONFIG_BOARD_LOOPSPERMSEC=16717
CONFIG_BUILTIN=y
CONFIG_DEBUG_ASSERTIONS=y
CONFIG_DEBUG_ASSERTIONS_EXPRESSION=y
CONFIG_DEBUG_FEATURES=y
CONFIG_DEBUG_FULLOPT=y
CONFIG_DEBUG_SYMBOLS=y
CONFIG_DEV_ZERO=y
CONFIG_ESP32_IMM_HEAP=y
CONFIG_ESP32_SPIRAM=y
CONFIG_ESP32_UART0=y
CONFIG_FS_PROCFS=y
CONFIG_HAVE_CXX=y
CONFIG_HAVE_CXXINITIALIZE=y
CONFIG_HEAP2_BASE=0x3f800000
CONFIG_HEAP2_SIZE=4194304
CONFIG_IDLETHREAD_STACKSIZE=3072
CONFIG_INIT_ENTRYPOINT="nsh_main"
CONFIG_INIT_STACKSIZE=3072
CONFIG_INTELHEX_BINARY=y
CONFIG_IOB_NBUFFERS=36
CONFIG_IOB_NCHAINS=36
CONFIG_IOB_THROTTLE=8
CONFIG_MM_IOB=y
CONFIG_MM_REGIONS=4
CONFIG_NDEBUG=y
CONFIG_NSH_ARCHINIT=y
CONFIG_NSH_BUILTIN_APPS=y
CONFIG_NSH_FILEIOSIZE=512
CONFIG_NSH_LINELEN=64
CONFIG_NSH_READLINE=y
CONFIG_PREALLOC_TIMERS=4
CONFIG_RAM_SIZE=114688
CONFIG_RAM_START=0x20000000
CONFIG_RR_INTERVAL=200
CONFIG_SCHED_HPWORK=y
CONFIG_SCHED_WAITPID=y
CONFIG_SMP=y
CONFIG_SMP_NCPUS=2
CONFIG_START_DAY=6
CONFIG_START_MONTH=12
CONFIG_START_YEAR=2011
CONFIG_SYSLOG_BUFFER=y
CONFIG_SYSTEM_NSH=y
CONFIG_TESTING_MM=y
CONFIG_UART0_SERIAL_CONSOLE=y
Can you also enable SMP for the
psramdefconfig`, please?
why? how about psram_usrheap?
Can you also enable SMP for the
psramdefconfig`, please?why? how about psram_usrheap?
It's up to you ;) I would enable it too if it works smoothly...
Hi @yamt
If I set
CONFIG_DEBUG_ASSERTIONSandCONFIG_DEBUG_FEATURESit fails to boot. I guess it's somehow debug-asserting. Can you take a look at it?The
defconfigI used for it:
there seems to be a few issues.
- you should bump CONFIG_MM_REGIONS.
- https://github.com/apache/nuttx/pull/13233
- for some reason, the last page is corrupted.
the following is with CONFIG_MM_FILL_ALLOCATIONS=y. thus i expect it's all 0xcc. but as you can see there are some 32byte patterns. (32byte because of EVENODD mode i guess) with NORMAL mode, the symptom disappears. but i guess NORMAL mode doen't meet the general coherency expectations for nuttx smp. do you have any ideas?
nsh> xd 3fbf7000 256
Hex dump:
0000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0010: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0020: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0030: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0040: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0050: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0060: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0070: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0080: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0090: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00a0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00b0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00c0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00d0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00e0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00f0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
nsh>
nsh> xd 3fbf8000 256
Hex dump:
0000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0010: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0020: 55 55 04 15 55 55 45 15 55 55 55 15 55 54 14 44 UU..UUE.UUU.UT.D
0030: 55 55 45 15 55 50 14 51 05 54 45 55 55 45 51 55 UUE.UP.Q.TEUUEQU
0040: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0050: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0060: 15 55 55 55 55 55 45 55 45 55 05 45 55 54 55 50 .UUUUUEUEU.EUTUP
0070: 45 55 54 55 55 55 55 15 55 40 55 55 55 04 45 55 [email protected]
0080: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
0090: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00a0: 54 54 55 04 54 51 55 55 54 55 11 45 45 05 00 55 TTU.TQUUTU.EE..U
00b0: 50 54 45 55 45 55 51 04 14 45 55 01 55 41 55 41 PTEUEUQ..EU.UAUA
00c0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00d0: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
00e0: 45 15 54 54 45 55 51 15 55 55 55 11 45 51 55 55 E.TTEUQ.UUU.EQUU
00f0: 54 55 55 11 55 51 55 55 55 45 55 55 54 44 50 54 TUU.UQUUUEUUTDPT
nsh>
Hi @yamt ,
there seems to be a few issues.
- you should bump CONFIG_MM_REGIONS.
Yes, this is the cause for failing the boot.
Although this change makes sense, I didn't find any issues related direct to it.
- for some reason, the last page is corrupted.
the following is with CONFIG_MM_FILL_ALLOCATIONS=y. thus i expect it's all 0xcc. but as you can see there are some 32byte patterns. (32byte because of EVENODD mode i guess) with NORMAL mode, the symptom disappears. but i guess NORMAL mode doen't meet the general coherency expectations for nuttx smp. do you have any ideas?
I have no idea. After increasing the CONFIG_MM_REGIONS, it was able to boot, but free command fails:
[CPU0] _assert: Assertion failed info.uordblks + info.fordblks == info.arena: at file: mm_heap/mm_mallinfo.c:157
In fact, if we disable CONFIG_DEBUG_ASSERTIONS, free returns:
nsh> free
total used free maxused maxfree nused nfree
esp32-imem: 98300 8628 89672 9024 89672 5 1
Umem: 4402076 4263516 4398464 4008 4194288 27 5
It isn't evaluating the used/free size correctly.
Although this change makes sense, I didn't find any issues related direct to it.
it prevents DIAGASSERT on CONFIG_MM_REGIONS from showing the panic messages.
- for some reason, the last page is corrupted.
the following is with CONFIG_MM_FILL_ALLOCATIONS=y. thus i expect it's all 0xcc. but as you can see there are some 32byte patterns. (32byte because of EVENODD mode i guess) with NORMAL mode, the symptom disappears. but i guess NORMAL mode doen't meet the general coherency expectations for nuttx smp. do you have any ideas?
I have no idea. After increasing the
CONFIG_MM_REGIONS, it was able to boot, butfreecommand fails:[CPU0] _assert: Assertion failed info.uordblks + info.fordblks == info.arena: at file: mm_heap/mm_mallinfo.c:157In fact, if we disable
CONFIG_DEBUG_ASSERTIONS,freereturns:nsh> free total used free maxused maxfree nused nfree esp32-imem: 98300 8628 89672 9024 89672 5 1 Umem: 4402076 4263516 4398464 4008 4194288 27 5It isn't evaluating the used/free size correctly.
it's because the last page corruption.
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
does it work for esp-idf well?
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
does it work for esp-idf well?
Yes. We have it supported if I'm not mistaken.
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
does it work for esp-idf well?
Yes. We have it supported if I'm not mistaken.
Ok. So I guess something wrong in nuttx.
Btw, as you know, nuttx esp32 code and esp-idf have a lot of similarlity. Do you know what's the situation about the copyright here? I'm wondering what I can copy and what I can't.
Btw, as you know, nuttx esp32 code and esp-idf have a lot of similarlity. Do you know what's the situation about the copyright here? I'm wondering what I can copy and what I can't.
IDF is licensed under Apache 2.0 License, so there is no problem "copying" code to NuttX. Please note, however, the third party repositories. You can check more details here: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/COPYRIGHT.html
Btw, as you know, nuttx esp32 code and esp-idf have a lot of similarlity. Do you know what's the situation about the copyright here? I'm wondering what I can copy and what I can't.
IDF is licensed under Apache 2.0 License, so there is no problem "copying" code to NuttX. Please note, however, the third party repositories. You can check more details here: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/COPYRIGHT.html
don't we need to inherit SPDX-FileCopyrightText: 2018-2024 Espressif Systems (Shanghai) CO LTD?
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
i guess https://github.com/apache/nuttx/pull/13249 is the root cause.
Btw, as you know, nuttx esp32 code and esp-idf have a lot of similarlity. Do you know what's the situation about the copyright here? I'm wondering what I can copy and what I can't.
IDF is licensed under Apache 2.0 License, so there is no problem "copying" code to NuttX. Please note, however, the third party repositories. You can check more details here: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/COPYRIGHT.html
don't we need to inherit
SPDX-FileCopyrightText: 2018-2024 Espressif Systems (Shanghai) CO LTD?
As far as I understand, it's enough to add a notice in the NOTICE file (according to https://www.apache.org/licenses/LICENSE-2.0.html). @jerpelea , can you help us here?
it's because the last page corruption.
I see... I will create an issue to keep track of this, but we have no schedule for this issue. If you find the root cause, we can help testing it on our CI too. For now, I don't think the PSRAM support is fully functional for ESP32 with SMP enabled.
i guess #13249 is the root cause.
right, I will test asap...
Hi @yamt
CI tests ran successfully. Let's wait for https://github.com/apache/nuttx/pull/13249 to be merged before merging this.
I tested i with the following configs (for psram and psram_usrheap). Can you please add them to this PR?
Usually, whenever we check that some defconfig is working as expected with SMP enabled, we let it be enabled in order to provide the best experience for the users. So, if you can enable them in this PR, I'll be glad. Thank you for submitting these PRs.
diff --git a/boards/xtensa/esp32/esp32-devkitc/configs/psram/defconfig b/boards/xtensa/esp32/esp32-devkitc/configs/psram/defconfig
index 9450b5475e..51de22702f 100644
--- a/boards/xtensa/esp32/esp32-devkitc/configs/psram/defconfig
+++ b/boards/xtensa/esp32/esp32-devkitc/configs/psram/defconfig
@@ -6,6 +6,7 @@
# modifications.
#
# CONFIG_ARCH_LEDS is not set
+# CONFIG_NDEBUG is not set
# CONFIG_NSH_ARGCAT is not set
# CONFIG_NSH_CMDOPT_HEXDUMP is not set
CONFIG_ARCH="xtensa"
@@ -15,6 +16,7 @@ CONFIG_ARCH_BOARD_ESP32_DEVKITC=y
CONFIG_ARCH_CHIP="esp32"
CONFIG_ARCH_CHIP_ESP32=y
CONFIG_ARCH_CHIP_ESP32WROVER=y
+CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_ARCH_STACKDUMP=y
CONFIG_ARCH_XTENSA=y
CONFIG_BOARD_LOOPSPERMSEC=16717
@@ -36,7 +38,7 @@ CONFIG_IOB_NBUFFERS=36
CONFIG_IOB_NCHAINS=36
CONFIG_IOB_THROTTLE=8
CONFIG_MM_IOB=y
-CONFIG_MM_REGIONS=4
+CONFIG_MM_REGIONS=5
CONFIG_NSH_ARCHINIT=y
CONFIG_NSH_BUILTIN_APPS=y
CONFIG_NSH_FILEIOSIZE=512
@@ -48,6 +50,8 @@ CONFIG_RAM_START=0x20000000
CONFIG_RR_INTERVAL=200
CONFIG_SCHED_HPWORK=y
CONFIG_SCHED_WAITPID=y
+CONFIG_SMP=y
+CONFIG_SMP_NCPUS=2
CONFIG_START_DAY=6
CONFIG_START_MONTH=12
CONFIG_START_YEAR=2011
diff --git a/boards/xtensa/esp32/esp32-devkitc/configs/psram_usrheap/defconfig b/boards/xtensa/esp32/esp32-devkitc/configs/psram_usrheap/defconfig
index 171374f6b0..ea09820207 100644
--- a/boards/xtensa/esp32/esp32-devkitc/configs/psram_usrheap/defconfig
+++ b/boards/xtensa/esp32/esp32-devkitc/configs/psram_usrheap/defconfig
@@ -6,6 +6,7 @@
# modifications.
#
# CONFIG_ARCH_LEDS is not set
+# CONFIG_NDEBUG is not set
# CONFIG_NSH_ARGCAT is not set
# CONFIG_NSH_CMDOPT_HEXDUMP is not set
CONFIG_ARCH="xtensa"
@@ -15,6 +16,7 @@ CONFIG_ARCH_BOARD_ESP32_DEVKITC=y
CONFIG_ARCH_CHIP="esp32"
CONFIG_ARCH_CHIP_ESP32=y
CONFIG_ARCH_CHIP_ESP32WROVER=y
+CONFIG_ARCH_INTERRUPTSTACK=2048
CONFIG_ARCH_STACKDUMP=y
CONFIG_ARCH_XTENSA=y
CONFIG_BOARD_LOOPSPERMSEC=16717
@@ -38,7 +40,7 @@ CONFIG_IOB_NBUFFERS=36
CONFIG_IOB_NCHAINS=36
CONFIG_IOB_THROTTLE=8
CONFIG_MM_IOB=y
-CONFIG_MM_REGIONS=3
+CONFIG_MM_REGIONS=4
CONFIG_NSH_ARCHINIT=y
CONFIG_NSH_BUILTIN_APPS=y
CONFIG_NSH_FILEIOSIZE=512
@@ -50,6 +52,8 @@ CONFIG_RAM_START=0x20000000
CONFIG_RR_INTERVAL=200
CONFIG_SCHED_HPWORK=y
CONFIG_SCHED_WAITPID=y
+CONFIG_SMP=y
+CONFIG_SMP_NCPUS=2
CONFIG_START_DAY=6
CONFIG_START_MONTH=12
CONFIG_START_YEAR=2011
I may have found a bug here. Let me debug it properly.
Confirmed. I'm sorry for the confusion. The tests regarding ESP32 + SMP without PSRAM failed because the change in https://github.com/apache/nuttx/pull/13249 is valid only when PSRAM is enabled, so we need to make it part of the general case. Can you commit the following to revert it and place it in the correct position:
Author: Tiago Medicci Serrano <[email protected]>
Date: Fri Aug 30 15:11:59 2024 -0300
esp32: fix initialization with PSRAM + SMP
Cache flush must be done prior to the APP cpu initalization. This,
however, must be true for the case where PSRAM is not available or
not selected. To do that, this commit flushs the cache during the
device initialization.
diff --git a/arch/xtensa/src/common/espressif/esp_loader.c b/arch/xtensa/src/common/espressif/esp_loader.c
index bf21c5f096..ebbf128044 100644
--- a/arch/xtensa/src/common/espressif/esp_loader.c
+++ b/arch/xtensa/src/common/espressif/esp_loader.c
@@ -273,8 +273,12 @@ int map_rom_segments(uint32_t app_drom_start, uint32_t app_drom_vaddr,
#endif
#ifdef CONFIG_ARCH_CHIP_ESP32
- cache_read_disable(0);
- cache_flush(0);
+ cache_read_disable(PRO_CPU_NUM);
+ cache_flush(PRO_CPU_NUM);
+# ifdef CONFIG_SMP
+ cache_flush(APP_CPU_NUM);
+ cache_read_enable(APP_CPU_NUM);
+# endif
#else
cache_hal_disable(CACHE_TYPE_ALL);
#endif
diff --git a/arch/xtensa/src/esp32/esp32_spiram.c b/arch/xtensa/src/esp32/esp32_spiram.c
index eed50e1fff..7eaa5295ec 100644
--- a/arch/xtensa/src/esp32/esp32_spiram.c
+++ b/arch/xtensa/src/esp32/esp32_spiram.c
@@ -68,13 +68,6 @@
# error "FLASH speed can only be equal to or higher than SRAM speed while SRAM is enabled!"
#endif
-/****************************************************************************
- * ROM Function Prototypes
- ****************************************************************************/
-
-extern void cache_flush(int cpu);
-extern void cache_read_enable(int cpu);
-
/****************************************************************************
* Private Data
****************************************************************************/
@@ -253,8 +246,6 @@ void IRAM_ATTR esp_spiram_init_cache(void)
/* Flush and enable icache for APP CPU */
#ifdef CONFIG_SMP
- cache_flush(APP_CPU_NUM);
- cache_read_enable(APP_CPU_NUM);
regval = getreg32(DPORT_APP_CACHE_CTRL1_REG);
regval &= ~(1 << DPORT_APP_CACHE_MASK_DRAM1);
putreg32(regval, DPORT_APP_CACHE_CTRL1_REG);
Can you also enable SMP for the
psramdefconfig`, please?why? how about psram_usrheap?
It's up to you ;) I would enable it too if it works smoothly...
done
I may have found a bug here. Let me debug it properly.
Confirmed. I'm sorry for the confusion. The tests regarding ESP32 + SMP without PSRAM failed because the change in #13249 is valid only when PSRAM is enabled, so we need to make it part of the general case. Can you commit the following to revert it and place it in the correct position:
i created the commit in this PR. (is this what you meant? i'm not entirely sure why you didn't submit a separate PR by yourself.) when creating the commit, i tried to preserve metadata. but there might be some botches. it would be simpler to use git-format-patch next time.
btw, are these rom functions documented somewhere? i was just making wild guesses from their names.
@tmedicci I confirmed that this PR fixes https://github.com/apache/nuttx/pull/13249#issuecomment-2322325239.