
[ESP-ADF/ESP32-S3] FreeRTOS Fatal Error (TLSP Corrupted) When Terminating an Audio Pipeline After Wake-up (AUD-6465)

Open solomonzw opened this issue 6 months ago • 2 comments

Hello Espressif Team,

We have encountered a persistent and reproducible FreeRTOS: Fatal error on an ESP32-S3 when using the ESP-ADF framework. The error is TLSP deletion callback at index 0 overwritten with non-excutable pointer 0xcececece, which occurs specifically during the cleanup of a task associated with a temporary audio pipeline.

We believe this might be a bug within the ESP-ADF framework, possibly in how audio elements or pipelines are de-initialized, leading to memory corruption in the FreeRTOS Task Control Block.


Environment

ADF Version: (left unfilled; the template asks for the ESP-ADF version in use, e.g., master or v2.6)
IDF Version: v5.4.1 (confirmed in our logs)
Development Board: M5Stack AtomS3R
Audio Codec: ES8311
Project: A real-time communication application using a wake-word engine and the VolcEngine RTC SDK.

Problem Description

The system is designed to work in wake-up mode. The normal workflow is as follows:

1. The device boots, connects to the RTC service, and waits for a wake word.
2. When the wake word is detected, the rec_engine_cb callback is triggered.
3. Inside this callback, the main player pipeline is stopped (player_pipeline_stop), and a temporary pipeline is created to play a prompt tone (audio_tone_play, which uses a wav_decoder and an i2s_stream).
4. After the prompt tone finishes playing, the temporary pipeline is terminated (audio_pipeline_terminate).

Immediately after this termination, prvIdleTask in FreeRTOS finds that the Task Control Block (TCB) of the just-deleted task has been corrupted. Specifically, the Thread-Local Storage Pointer (TLSP) deletion callback points to 0xcececece. This triggers a fatal error and reboots the system.

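Final Reproducible Log

This is the final crash log after enabling heap corruption detection and stack overflow checking. The 0xcececece value strongly suggests heap-related corruption: in "Comprehensive" mode, ESP-IDF fills newly allocated heap memory with 0xCE bytes and freed memory with 0xFE bytes, so the callback slot appears to hold memory that was allocated but never initialized. This can also happen when a freed block has been handed out again by a later allocation, so a use-after-free is not ruled out.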
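For reference when reading such a crash log: ESP-IDF's comprehensive heap poisoning uses two distinct fill bytes, 0xCE for newly allocated memory and 0xFE for freed memory. The following is a minimal host-side sketch for telling them apart. The type and function names (poison_kind_t, classify_poison, poison_kind_name) are hypothetical helpers for illustration and are not part of ESP-IDF or ESP-ADF.

```c
#include <assert.h>
#include <stdint.h>

/* Fill bytes used by ESP-IDF heap poisoning in "Comprehensive" mode. */
#define MALLOC_FILL_BYTE 0xCEu /* written into freshly allocated blocks */
#define FREE_FILL_BYTE   0xFEu /* written into freed blocks */

/* Hypothetical classification of a suspicious 32-bit value from a log. */
typedef enum {
    POISON_NONE,          /* value does not look like a fill pattern */
    POISON_UNINITIALIZED, /* 0xCECECECE: allocated, never written */
    POISON_FREED          /* 0xFEFEFEFE: read after free */
} poison_kind_t;

/* Returns 1 when all four bytes of `value` equal `fill`. */
static int all_bytes_equal(uint32_t value, uint8_t fill)
{
    /* Multiplying by 0x01010101 replicates the byte into all four lanes. */
    return value == (uint32_t)fill * 0x01010101u;
}

poison_kind_t classify_poison(uint32_t value)
{
    if (all_bytes_equal(value, MALLOC_FILL_BYTE)) {
        return POISON_UNINITIALIZED;
    }
    if (all_bytes_equal(value, FREE_FILL_BYTE)) {
        return POISON_FREED;
    }
    return POISON_NONE;
}

const char *poison_kind_name(poison_kind_t kind)
{
    switch (kind) {
    case POISON_UNINITIALIZED:
        return "allocated but uninitialized heap memory";
    case POISON_FREED:
        return "freed heap memory";
    default:
        assert(kind == POISON_NONE); /* no other enum values exist */
        return "no poison pattern";
    }
}
```

Passing the pointer from the log, classify_poison(0xcecececeu), yields POISON_UNINITIALIZED, whereas a read from a freed block that has not been reallocated would be expected to show 0xfefefefe instead.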

I (124186) AUDIO_PROCESSOR: [ * ] Stop event received
E (124186) FreeRTOS: Fatal error: TLSP deletion callback at index 0 overwritten with non-excutable pointer 0xcececece

abort() was called at PC 0x403846a8 on core 0
--- 0x403846a8: vPortTLSPointersDelCb at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:612
---  (inlined by) vPortTCBPreDeleteHook at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:664

Backtrace: 0x4037a115:0x3fcc2cd0 0x4038374d:0x3fcc2cf0 0x4038b89e:0x3fcc2d10 0x403846a8:0x3fcc2d80 0x40385772:0x3fcc2da0 0x40385836:0x3fcc2dc0
--- 0x4037a115: panic_abort at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/esp_system/panic.c:454
--- 0x4038374d: esp_system_abort at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/esp_system/port/esp_system_chip.c:87
--- 0x4038b89e: abort at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/newlib/abort.c:38
--- 0x403846a8: vPortTLSPointersDelCb at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:612
---  (inlined by) vPortTCBPreDeleteHook at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:664
--- 0x40385772: prvDeleteTCB at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:4939
--- 0x40385836: prvCheckTasksWaitingTermination at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:4648
---  (inlined by) prvIdleTask at G:/ESP32IDE/ESP32_5_4_1/v5.4.1/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:4346

What We Have Tried

We have tried numerous application-level fixes, none of which have solved the underlying memory corruption:

1. Increasing the main task and timer task stack sizes to 8192 and 4096 respectively.
2. Adding delays (vTaskDelay) of up to 1 second between pipeline operations to rule out simple race conditions.
3. Refactoring the application logic to use a single "manager task" with a command queue to serialize all audio pipeline operations (stop, run, play_tone).
4. Enabling "Comprehensive" Heap Corruption Detection and "Canary Bytes" Stack Overflow Checking. These tools did not trigger a different crash, but the 0xcececece value shows that heap poisoning is active: 0xCE is the byte ESP-IDF writes into newly allocated memory (freed memory is filled with 0xFE).

A workaround that prevents the crash is to modify the audio_tone_play function so that it does not call audio_pipeline_terminate. This causes a memory leak but allows the rest of the application to function, strongly suggesting the bug lies in the pipeline termination/task cleanup process.
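To make the "manager task" attempt mentioned above concrete, here is a minimal host-side sketch of the serialization idea: every pipeline operation is posted as a command to one queue, and a single consumer handles the commands strictly in order. This is an illustration under assumptions, not our actual firmware. The names (audio_cmd_t, cmd_queue_t, cmd_queue_send, cmd_queue_receive, manager_drain) are hypothetical, and on the target the ring buffer would be a FreeRTOS queue (xQueueCreate, xQueueSend, xQueueReceive) drained by one dedicated task.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical commands covering the three serialized operations. */
typedef enum { CMD_STOP_PLAYER, CMD_PLAY_TONE, CMD_RUN_PLAYER } audio_cmd_t;

#define QUEUE_CAPACITY 8

/* Fixed-size FIFO standing in for a FreeRTOS queue on the host. */
typedef struct {
    audio_cmd_t items[QUEUE_CAPACITY];
    size_t head;  /* index of the oldest queued command */
    size_t count; /* number of queued commands */
} cmd_queue_t;

/* Appends a command; returns 1 on success, 0 if the queue is full. */
int cmd_queue_send(cmd_queue_t *q, audio_cmd_t cmd)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) {
        return 0;
    }
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = cmd;
    q->count++;
    return 1;
}

/* Pops the oldest command; returns 1 on success, 0 if the queue is empty. */
int cmd_queue_receive(cmd_queue_t *q, audio_cmd_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0) {
        return 0;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}

/* Single consumer: handles commands one at a time, in arrival order.
 * On the target, the matching pipeline call would be made where the
 * command is recorded here. Returns the number of commands handled. */
size_t manager_drain(cmd_queue_t *q, audio_cmd_t *handled, size_t max)
{
    size_t n = 0;
    audio_cmd_t cmd;
    while (n < max && cmd_queue_receive(q, &cmd)) {
        handled[n++] = cmd;
    }
    return n;
}
```

Because only the consumer touches the pipelines, two operations can never overlap. In our case the crash persisted even with this structure, which is why we do not think a simple race between callers is the cause.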

We hope this detailed report helps in identifying a potential bug in the ESP-ADF framework. Thank you for your time and support.


solomonzw, Jun 22 '25 12:06

@solomonzw What version of ADF did you use?

jason-mao, Jun 23 '25 08:06

What version of ADF are you using?

G:\ESP32-ADF\esp-adf>git describe --tags
v2.7-105-g4200c64d

solomonzw, Jun 24 '25 15:06