ESP32 crash "Cache disabled but cached memory region accessed"
I experience frequent, hard-to-debug crashes when using FastAccelStepper. The crash shows up on the serial output as:
Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed).
However, the stack is corrupted, so a full stack trace is unavailable. The PC at the time of the crash points to gpio_set_level, in esp-idf/components/driver/gpio.c:229
My observations / thoughts so far:
- Crashes can be prevented by removing stepper->moveTo calls from my code
- Crashes can be prevented by feeding stepper->moveTo calls with constant position
- The current PC at crash points to gpio_set_level
- As there are no other GPIO changes done by my code, this pins down the ones done in FastAccelStepper - which are all direction pin changes in my case.
- "Cache disabled but cached memory region accessed" often seems to be related to flash access, which my code does frequently through the Prefs / LittleFS libraries.
So my guess is that FastAccelStepper may run into issues when calling gpio_set_level for the direction pin without some precautions regarding the data accessed.
I've made some further tests: calling moveTo without changing the motor direction, e.g. just going up for some time, runs stable (tested for at least one hour). This again points to the gpio_set_level call for the dir pin as the location of the crash.
As a first step, try to stop the motor before writing to flash. The ESP-IDF documentation says: "Cache disabled but cached memory region accessed: In some situations, ESP-IDF will temporarily disable access to external SPI Flash and SPI RAM via caches. For example, this happens when spi_flash APIs are used to read/write/erase/mmap regions of SPI Flash." In my experience, the stepper makes sudden pauses while writing to flash during OTA (no steps lost though :-) ), because the StepperTask cannot fill the queue while the write is in progress. And I doubt there is any way to make this work reliably.
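A minimal sketch of the "stop the motor before writing to flash" idea, not code from the library. It only assumes that the stepper object offers stopMove() and isRunning(), as FastAccelStepper does. The helper name writeWithMotorStopped, the FakeStepper stand-in and the timeout handling are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: ask the stepper to stop, wait until it reports
// standstill, then run the flash access. Returns false (and skips the
// write) if the motor is still running after timeoutMs polls of 1 ms.
// Stepper is any type with stopMove() and isRunning().
template <typename Stepper, typename FlashWrite, typename Wait>
bool writeWithMotorStopped(Stepper& stepper, FlashWrite flashWrite,
                           Wait waitMs, uint32_t timeoutMs) {
  stepper.stopMove();  // request a stop; the motor ramps down first
  uint32_t waited = 0;
  while (stepper.isRunning()) {
    if (waited >= timeoutMs) {
      return false;  // give up rather than touch flash while stepping
    }
    waitMs(1);
    ++waited;
  }
  flashWrite();  // no step or direction changes are pending now
  return true;
}

// Host-side stand-in used only to exercise the helper off-target.
struct FakeStepper {
  int pollsUntilStopped = 0;
  bool stopRequested = false;
  void stopMove() { stopRequested = true; }
  bool isRunning() { return pollsUntilStopped-- > 0; }
};

// On the ESP32 the call could look like this (prefs and pos are
// placeholders for the application's own objects):
//   writeWithMotorStopped(*stepper,
//                         [&] { prefs.putInt("pos", pos); },
//                         [](uint32_t ms) { delay(ms); },
//                         5000);
```

This does not make flash access safe while the motor is running; it only avoids the overlap, at the cost of pausing the motion.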
Debugging on the ESP32 is not trivial, and the Guru Meditation errors are often not helpful. Apparently, gpio_set_level is not safe to call from an interrupt, which is what FastAccelStepper does for direction changes. There is a flag:
"This function is allowed to be executed when Cache is disabled within ISR context, by enabling CONFIG_GPIO_CTRL_FUNC_IN_IRAM." Do you know if this is enabled in your environment?
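One way to find out whether the flag is set in a given build is a compile-time check against sdkconfig.h. This is a rough sketch: the function name gpioCtrlInIramStatus and the HAVE_SDKCONFIG macro are made up for illustration, and it assumes the usual Kconfig behaviour that an enabled boolean option becomes a defined macro while a disabled one is left undefined.

```cpp
// Pull in sdkconfig.h if the build environment provides it.
#if defined(__has_include)
#if __has_include("sdkconfig.h")
#include "sdkconfig.h"
#define HAVE_SDKCONFIG 1  // hypothetical marker, not an ESP-IDF macro
#endif
#endif

// Reports how the prebuilt ESP-IDF was configured with respect to
// CONFIG_GPIO_CTRL_FUNC_IN_IRAM.
const char* gpioCtrlInIramStatus() {
#if defined(CONFIG_GPIO_CTRL_FUNC_IN_IRAM)
  return "enabled";
#elif defined(HAVE_SDKCONFIG)
  return "disabled";
#else
  return "unknown";  // sdkconfig.h was not found (e.g. host build)
#endif
}

// In an Arduino sketch this could be printed once from setup():
//   Serial.printf("GPIO ctrl funcs in IRAM: %s\n", gpioCtrlInIramStatus());
```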
Another alternative is to define the dir pin as an external pin and use digitalWrite or similar in the callback. The difference is that the external pin handler does not run in the context of an ISR.
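A rough sketch of the external dir pin setup, modelled on the library's external-pin mechanism (setExternalCallForPin and PIN_EXTERNAL_FLAG). The pin numbers, speed and acceleration are placeholders. The host-side branch, including the fallback value 128 for PIN_EXTERNAL_FLAG, is an assumption added only so the callback can be compiled and tried off-target. Returning the value just written is assumed to tell the library that the change has been applied.

```cpp
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>
#include <FastAccelStepper.h>
#else
// Host-side stand-ins (assumptions, not library code).
#define PIN_EXTERNAL_FLAG 128  // assumed to match FastAccelStepper.h
static uint8_t fakePinLevel[128] = {0};
static void digitalWrite(uint8_t pin, uint8_t level) {
  fakePinLevel[pin] = level;
}
#endif

constexpr uint8_t STEP_PIN = 25;  // placeholder wiring
constexpr uint8_t DIR_PIN = 26;   // placeholder wiring

// Called by the library outside of ISR context for pins that carry
// PIN_EXTERNAL_FLAG. Reports the pin state after the write.
bool setExternalPin(uint8_t pin, uint8_t value) {
  pin = static_cast<uint8_t>(pin & ~PIN_EXTERNAL_FLAG);  // real GPIO number
  digitalWrite(pin, value);
  return value != 0;
}

#if defined(ARDUINO)
FastAccelStepperEngine engine = FastAccelStepperEngine();
FastAccelStepper* stepper = nullptr;

void setup() {
  pinMode(DIR_PIN, OUTPUT);
  engine.init();
  engine.setExternalCallForPin(setExternalPin);
  stepper = engine.stepperConnectToPin(STEP_PIN);
  if (stepper != nullptr) {
    // The flag routes direction changes through setExternalPin.
    stepper->setDirectionPin(DIR_PIN | PIN_EXTERNAL_FLAG);
    stepper->setSpeedInHz(1000);     // placeholder
    stepper->setAcceleration(2000);  // placeholder
  }
}

void loop() {}
#endif
```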
Regarding the flag ("This function is allowed to be executed when Cache is disabled within ISR context, by enabling CONFIG_GPIO_CTRL_FUNC_IN_IRAM. Do you know if this is enabled in your environment?"):
I've tried to change that, but ESP-IDF comes prebuilt in the "arduino" environment. Unfortunately, I was not able to switch the whole build environment to esp_idf and get my project to build. After some struggling, I reverted to the esp32 arduino environment; I guess I am stuck there.
However, the other suggestion ("Another alternative is to define dir pin as external pin and in the callback use digitalWrite or similar. The difference is, that external pin handler is not run in context of an ISR.") was good advice! I've tried that and have seen no crashes so far.
What are the caveats of running the dir pin in this fashion?
This workaround is a supported method.
Caveat: it may insert an additional delay of MIN_CMD_TICKS on direction changes. In a real application the difference is not expected to be noticeable, unless the external pin handler keeps returning the old state for a long time and gets called repeatedly every 4 ms.
Seems to work so far. Is this documented somewhere? That would prevent future bug reports about the same issue. Or maybe the ESP32 version could use interrupt- and cache-safe code in the first place, with an option to use the unsafe code for better performance?
Otherwise this triggers hard-to-debug bugs. For example, my code only reads from flash, and never writes to it, while running into these issues. Also, due to the stack corruption, I would have a hard time pinning this down in code that uses other GPIO operations. In my case it was easy to suspect the FastAccelStepper code, but that would not always be so.
There is a new branch, gpio_ll. The esp32 rmt and mcpwm/pcnt drivers have been changed to use gpio_ll functions, which should be inlined. Please give this a try. Perhaps this fixes the problem without further intervention from the app developer.
This is now implemented in the new release v0.31.0.