esp-storage
Does one have to disable interrupts before using `FlashStorage::write`?
I'm seeing this backtrace on an ESP32-S3 when ISRs are running on the same core:
Backtrace
Exception occured 'IllegalInstruction'
Context
PC=0x4201219e PS=0x00060030
0x4201219e - core::sync::atomic::atomic_load
at C:\Users\d.polonski\.rustup\toolchains\esp\lib\rustlib\src\rust\library\core\src\sync\atomic.rs:3153
0x00060030 - PS_WOE
at ??:??
A0=0x82006b38 A1=0x3fcda550 A2=0x00000000 A3=0x3fcda630 A4=0x00000000
0x82006b38 - _rtc_fast_bss_end
at ??:??
0x3fcda550 - _heap_end
at ??:??
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x3fcda630 - _heap_end
at ??:??
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
A5=0x41200000 A6=0x00000000 A7=0x3fcda550 A8=0x00000023 A9=0x3fcda530
0x41200000 - __default_double_exception
at ??:??
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x3fcda550 - _heap_end
at ??:??
0x00000023 - PS_UM
at ??:??
0x3fcda530 - _heap_end
at ??:??
A10=0x00000000 A11=0x00000000 A12=0xbf0d0690 A13=0x6000880c A14=0x801702a5
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0xbf0d0690 - _rtc_fast_bss_end
at ??:??
0x6000880c - _rtc_slow_bss_end
at ??:??
0x801702a5 - _rtc_fast_bss_end
at ??:??
A15=0x3fcda530
0x3fcda530 - _heap_end
at ??:??
SAR=0000001f
EXCCAUSE=0x00000000 EXCVADDR=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
LBEG=0x00000000 LEND=0x00000000 LCOUNT=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
THREADPTR=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
SCOMPARE1=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
BR=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
ACCLO=0x00000000 ACCHI=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
M0=0x00000000 M1=0x00000000 M2=0x00000000 M3=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
F64R_LO=0x000000ad F64R_HI=0x00000000 F64S=0x3c0331f4
0x000000ad - XT_STK_F5
at ??:??
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x3c0331f4 - _ZN15core_control_v43adc23ADC_PLAUSIBILITY_RANGES17h889f840f788e9dafE
at ??:??
FCR=0x00000000 FSR=0x00000080
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000080 - XT_STK_M3
at ??:??
F0=0x00000000 F1=0x00000000 F2=0x00000000 F3=0x00000000 F4=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
F5=0x00000000 F6=0x00000000 F7=0x00000000 F8=0x00000000 F9=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
F10=0x3c2237c3 F11=0x43990677 F12=0x41e88f61 F13=0x00000000 F14=0x00000000
0x3c2237c3 - _sidata
at ??:??
0x43990677 - _ZN17compiler_builtins3mem6memcmp17hf37f2c57db2018e8E
at ??:??
0x41e88f61 - __default_double_exception
at ??:??
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
F15=0x00000000
0x00000000 - <esp32s3::RTC_CNTL as core::fmt::Debug>::fmt
at C:\Users\d.polonski\.cargo\registry\src\index.crates.io-6f17d22bba15001f\esp32s3-0.19.0\src\lib.rs:1388
0x42019b99
0x42019b99 - core::fmt::Arguments::new_const
at C:\Users\d.polonski\.rustup\toolchains\esp\lib\rustlib\src\rust\library\core\src\fmt\mod.rs:301
0x40000000
0x40000000 - _external_ram_end
at ??:??
0x40034c48
0x40034c48 - rom_i2c_writeReg_Mask
at ??:??
0x40000000
0x40000000 - _external_ram_end
at ??:??
I'd expect the ROM functions to disable interrupts on their own - plus by default we call them in a critical-section 🤔
Is there anything special in your code which causes this problem? I could try to reproduce it on my own but maybe it's easier if you can share the exact steps
I'm a little short on time right now to provide a full reproduction. The general setup is:
main core:
- do the `FlashStorage::write`
- run TWAI ISRs, possibly calling `critical_section::with`
second core:
- run other code, possibly calling `critical_section::with`
Ah ... I can imagine interrupts triggered on the other core causing problems while one core is accessing the (detached) flash. This probably needs some more synchronization work between the cores, then.
Now after further thinking: We need to completely halt the other core during flash access
A bit of further reading here: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/spi_flash/spi_flash_concurrency.html and https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/spi_flash/index.html#implementation-details.
Interrupts can fire but all code must be in IRAM already. And like @bjoernQ already said, both cores must be stalled/spinning until the flash operation is complete.
I think the best bet, for now, is to disable interrupts and make the core that runs the flash operation spin until the other core has acquired some mutex (or similar) to signal that it will wait until we're done with flash. All of this is actually quite involved, and will most likely need some help from esp-hal?
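A rough sketch of that handshake, run on a host with two threads standing in for the two cores. Everything here (`FlashHandshake`, `poll`, `simulate`) is a made-up name for illustration, not an esp-hal or esp-storage API. On real hardware interrupts would also have to be masked, and the spinning code on the other core would have to live in IRAM.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Shared flags for the handshake (hypothetical type).
pub struct FlashHandshake {
    /// Set by the flash-writing core to ask the other core to stop.
    request: AtomicBool,
    /// Set by the other core once it is spinning and will not touch flash.
    parked: AtomicBool,
}

impl FlashHandshake {
    pub fn new() -> Self {
        Self {
            request: AtomicBool::new(false),
            parked: AtomicBool::new(false),
        }
    }

    /// Flash-writing core: run `op` only after the other core has acknowledged.
    pub fn with_other_core_parked<R>(&self, op: impl FnOnce() -> R) -> R {
        self.request.store(true, Ordering::SeqCst);
        while !self.parked.load(Ordering::SeqCst) {
            std::hint::spin_loop();
        }
        let result = op();
        self.request.store(false, Ordering::SeqCst);
        // Wait until the other core has left the parked state, so the next
        // request starts from a clean slate.
        while self.parked.load(Ordering::SeqCst) {
            std::hint::spin_loop();
        }
        result
    }

    /// Other core: call this regularly; it spins while a request is pending.
    pub fn poll(&self) {
        if self.request.load(Ordering::SeqCst) {
            self.parked.store(true, Ordering::SeqCst);
            while self.request.load(Ordering::SeqCst) {
                std::hint::spin_loop();
            }
            self.parked.store(false, Ordering::SeqCst);
        }
    }
}

/// Host-side simulation: returns whether the other "core" was parked during
/// the operation, and the value the operation produced.
pub fn simulate() -> (bool, u32) {
    let hs = Arc::new(FlashHandshake::new());
    let stop = Arc::new(AtomicBool::new(false));

    let other = {
        let hs = Arc::clone(&hs);
        let stop = Arc::clone(&stop);
        thread::spawn(move || {
            while !stop.load(Ordering::SeqCst) {
                hs.poll();
            }
        })
    };

    let mut seen_parked = false;
    let written = hs.with_other_core_parked(|| {
        // Stand-in for the flash write.
        seen_parked = hs.parked.load(Ordering::SeqCst);
        0xA5u32
    });

    stop.store(true, Ordering::SeqCst);
    other.join().expect("other core thread panicked");
    (seen_parked, written)
}
```

The weak point of a cooperative scheme like this is that the other core only parks when it gets around to calling `poll`, so the flash-writing core can spin for an unbounded time.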
This also probably has implications for esp-wifi and its scheduler too, we don't want to starve that completely, fun problem to solve I guess :D.
Slightly off-topic but how does one put code in IRAM in Rust? Is it even possible for library code one does not control?
With great difficulty. There is no way (in GCC or Rust) to recursively apply a linker-section attribute. This means that while a top-level function may be placed in an IRAM section, the functions it calls are not necessarily placed there too. You have to check yourself that all code ends up in IRAM. In esp-idf there is a fuzzy checker which tries to warn you if you call code that isn't in IRAM from an IRAM-placed function.
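For a single function, a minimal sketch of the placement looks roughly like this. The section name `.rwtext` is an assumption about what the linker script calls its IRAM code section; check the linker script of the HAL you use. The `cfg_attr` is only there so the sketch also compiles on a host.

```rust
/// Placed in the (assumed) IRAM section on Xtensa targets; on any other
/// target the attribute is skipped and this is an ordinary function.
#[cfg_attr(target_arch = "xtensa", link_section = ".rwtext")]
#[inline(never)]
pub fn checksum_in_iram(data: &[u8]) -> u8 {
    let mut acc = 0u8;
    // The attribute covers this function body only. The iterator methods
    // used by the `for` loop are separate functions; they are usually inlined
    // in optimized builds, but nothing guarantees it, which is exactly the
    // "not recursive" problem described above.
    for &byte in data {
        acc = acc.wrapping_add(byte);
    }
    acc
}
```

Anything the function calls that is not inlined (panic paths, formatting, library helpers) can still end up in flash, so the resulting binary has to be inspected to be sure.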
I think I remember seeing code in esp-idf where they just paused and unpaused the other core. But maybe I'm wrong ... need to check that.
Even if the other core runs non-interrupt code, it might cross a cache page. Pausing is certainly very bad if whatever the other core runs is very time-critical; in that case everything on the other core should be in RAM. So pausing the other core should probably be the default, with a way to opt out.
You could also try to manually pause the other core in the meantime and see if that helps.
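One way to structure such a manual pause so the other core is always resumed, sketched with a hypothetical `OtherCore` trait. The trait, the guard and the mock are all invented for illustration; on real hardware the trait would wrap whatever core-stall mechanism the HAL exposes.

```rust
/// Hypothetical abstraction over "stall / resume the other CPU core".
pub trait OtherCore {
    fn park(&mut self);
    fn unpark(&mut self);
}

/// RAII guard: parks the other core on creation and resumes it on drop,
/// including when the guarded operation returns early or unwinds.
pub struct Parked<'a, C: OtherCore> {
    core: &'a mut C,
}

impl<'a, C: OtherCore> Parked<'a, C> {
    pub fn new(core: &'a mut C) -> Self {
        core.park();
        Self { core }
    }
}

impl<C: OtherCore> Drop for Parked<'_, C> {
    fn drop(&mut self) {
        self.core.unpark();
    }
}

/// Run `op` (for example a flash write) while the other core is parked.
pub fn flash_op_with_core_parked<C: OtherCore, R>(
    core: &mut C,
    op: impl FnOnce() -> R,
) -> R {
    let _guard = Parked::new(core);
    op()
}

/// Host-side stand-in that just records the calls it receives.
#[derive(Default)]
pub struct MockCore {
    pub log: Vec<&'static str>,
}

impl OtherCore for MockCore {
    fn park(&mut self) {
        self.log.push("park");
    }
    fn unpark(&mut self) {
        self.log.push("unpark");
    }
}
```

With the mock, `flash_op_with_core_parked(&mut core, || 42)` returns the closure's value and leaves `park` followed by `unpark` in `core.log`.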
Regarding recursively checking the call tree for non-IRAM functions, we could try what cargo-call-stack does (I think it compiles to WASM and parses that ... not sure I remember it correctly, but definitely not fun).
Found the CPU pausing in NuttX
https://github.com/apache/nuttx/blob/1a8027d6250dc7bf2156e293d453a611cd8dce35/arch/xtensa/src/esp32/esp32_spiflash.c#L473