ESP RTC crashing due to heap corruption after several hours (AUD-6549)
- [X] I have read the documentation Espressif Advanced Development Framework Guide and the issue is not addressed there.
- [X] I have updated my ADF and IDF branch (master or release) to the latest version and checked that the issue is present there.
- [X] I have searched the issue tracker for a similar issue and not found a similar issue.
Environment
- Audio development kit: none
- Audio kit version (for ESP32-LyraT/ESP32-LyraT-Mini/ESP32-S3-Korvo-2): [v1|v2|v3|v4]
- [Required] Module or chip used: [ESP32-WROOM-32E|ESP32-WROVER-E|ESP32-S2-WROVER|ESP32-S3-WROOM-1]
- [Required] IDF version (run git describe --tags in $IDF_PATH folder to find it): v5.3.3-927-gbf79937908
- [Required] ADF version (run git describe --tags in $ADF_PATH folder to find it): 7b987eeda2f917e28fb262a42565080dc2999fa2
- Build system: idf.py
- [Required] Running log: All logs from power-on to problem recurrence
- Compiler version (run xtensa-esp32-elf-gcc --version in your project folder to find it): xtensa-esp-elf-gcc (crosstool-NG esp-13.2.0_20240530) 13.2.0
- Operating system: macOS/Linux
- (Windows only) Environment type: [MSYS2 mingw32|ESP Command Prompt|Plain Command Prompt|PowerShell]
- Using an IDE?: VSCode
- Power supply: USB
Problem Description
An application that uses the esp_rtc component to connect to a SIP server (hosted by Kamailio) connects fine and keeps reconnecting for several hours, then crashes due to heap corruption, even when no other user tasks are executing.
The existing sdkconfig uses Comprehensive heap canaries.
Expected Behavior
There is no crash
Actual Behavior
Application crashes when freeing the memory after several hours.
Steps to Reproduce
- Setup a Sip Server - we used kamailio
- Use the provided application in the zip file
- Replace the placeholders in lines 52-61
No special wiring is needed, just an ESP32 devkit V1.
Code to Reproduce This Issue
(https://gist.github.com/vasilerares/1383e461945c22b635a59f7f4347c994)
Debug Logs
SIP/2.0 200 OK
Via: SIP/2.0/TLS server-ip:5061;branch=z9hG4bKx.94922.1.0
Contact: <sip:[email protected]:51479;transport=TLS>
From: <sip:[email protected]>;tag=uloc-686fbe63-1ed797-5202-cd6be8b4-68779f13-68945-172ca.1
To: <sip:[email protected]>;tag=-1475071133
Call-ID: ksrulka-3f8c592b-1ed78a-172ca.1
CSeq: 80 OPTIONS
Server: ESP32 SIP/2.0
Allow: ACK, INVITE, BYE, UPDATE, CANCEL, OPTIONS, INFO
Content-Length: 0
Accept: application/sdp, application/sdp
Allow: INVITE, ACK, CANCEL, BYE, UPDATE, NOTIFY, REFER, MESSAGE, OPTIONS, INFO, SUBSCRIBE
Supported: replaces, norefersub, extended-refer, timer, X-cisco-serviceuri
User-Agent: ESP32 SIP/2.0
Allow-Events: presence, kpml
I (11910365) SIP: [2025-07-16/12:46:05]=======================>>
I (11910375) main: Running heap check
I (11910385) main: heap free: 203600
I (11916435) SIP: Sending keep-alive to server
W (11917475) SIP: CHANGE STATE FROM 2, TO 0, :func: sip_reconnect:385
assert failed: heap_caps_free heap_caps_base.c:75 (heap != NULL && "free() target pointer is outside heap areas")
Backtrace: 0x40081b3a:0x3ffcd470 0x40089895:0x3ffcd490 0x40091425:0x3ffcd4b0 0x400823cb:0x3ffcd5d0 0x40091455:0x3ffcd5f0 0x400db3ad:0x3ffcd610 0x400dc243:0x3ffcd630 0x400dd234:0x3ffcd650 0x400dd746:0x3ffcd680 0x4008a3e1:0x3ffcd6e0
0x40081b3a: panic_abort at test/.devcontainer/esp/idf/components/esp_system/panic.c:478
0x40089895: esp_system_abort at test/.devcontainer/esp/idf/components/esp_system/port/esp_system_chip.c:87
0x40091425: __assert_func at test/.devcontainer/esp/idf/components/newlib/assert.c:80
0x400823cb: heap_caps_free at test/.devcontainer/esp/idf/components/heap/heap_caps_base.c:75 (discriminator 1)
0x40091455: free at test/.devcontainer/esp/idf/components/newlib/heap.c:39
0x400db3ad: media_lib_free at test/.devcontainer/esp/adf/components/esp-adf-libs/media_lib_sal/media_lib_os.c:86
0x400dc243: _sip_clean_parse at /builds/adf/esp-adf-libs-source/esp_media_protocols/esp_rtc/esp_rtc_core/esp_rtc_sip/esp_rtc_sip.c:319 (discriminator 1)
0x400dd234: sip_reconnect at /builds/adf/esp-adf-libs-source/esp_media_protocols/esp_rtc/esp_rtc_core/esp_rtc_sip/esp_rtc_sip.c:387
0x400dd746: sip_connect at /builds/adf/esp-adf-libs-source/esp_media_protocols/esp_rtc/esp_rtc_core/esp_rtc_sip/esp_rtc_sip.c:1815
(inlined by) _sip_task at /builds/adf/esp-adf-libs-source/esp_media_protocols/esp_rtc/esp_rtc_core/esp_rtc_sip/esp_rtc_sip.c:1997
0x4008a3e1: vPortTaskWrapper at test/.devcontainer/esp/idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:139
Other Items If Possible
- [test.zip] sdkconfig file (Attach the sdkconfig file from your project folder)
- [ ] elf file in the build folder (Note this may contain all the code details and symbols of your project.)
- [ ] coredump (This provides stacks of tasks.)
@shootao The issue is caused by not clearing the sip_handle->current_info pointer when the server sends an OPTIONS request that has some trailing \r\n.
The option_info and current_info pointers point to the same address. _sip_clean_parse, called inside _sip_uas_response_option, frees the memory, but the current_info member still references it. When execution reaches sip_reconnect, it tries to clean current_info, but since the memory has already been freed, it panics.
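The aliasing described above can be illustrated with a small sketch. The esp_rtc sources are not public, so every type and function name below (sketch_info_t, sketch_handle_t, sketch_clean_parse) is a hypothetical stand-in, not the library's code. It only shows one way a cleanup helper could clear every handle member that refers to the block it frees, so that a later cleanup becomes a no-op instead of a second free.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parsed-message info; not the real esp_rtc type. */
typedef struct {
    char *call_id;
} sketch_info_t;

/* Hypothetical stand-in for the two aliased members of the SIP handle. */
typedef struct {
    sketch_info_t *current_info;
    sketch_info_t *option_info;
} sketch_handle_t;

/*
 * Frees the info block referenced by *slot.
 * Every handle member that aliases the block is set to NULL before the
 * block is released, so no stale pointer to freed memory is left behind.
 */
static void sketch_clean_parse(sketch_handle_t *h, sketch_info_t **slot)
{
    sketch_info_t *info = *slot;
    if (info == NULL) {
        return; /* already cleaned, nothing to free */
    }
    if (h->current_info == info) {
        h->current_info = NULL;
    }
    if (h->option_info == info) {
        h->option_info = NULL;
    }
    *slot = NULL;
    free(info->call_id);
    free(info);
}
```

With this shape, calling the helper first for option_info and later for current_info frees the block exactly once, because the second call sees NULL.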
Can we have a library with the fix delivered?
@vasilerares Do you have a detailed gdbstub log taken with bt? The heap check reports that the memory is not in the heap, so this does not look like a double free; the memory may be corrupted instead. I see no extra operations near the corruption point, only an OPTIONS send and a timeout that triggers a reconnect. You can manually set the timeout to a small value to check whether that makes the issue easier to reproduce.
I will also try to reproduce it locally.
@TempoTian I don't have the log. If it is needed, I can provide one. Initially, I had set the reconnect parameter to 30s, but the issue does not reproduce unless the server sends an OPTIONS request with trailing newlines. At most, a shorter timeout makes the issue appear much earlier after \r\nOPTIONS.
This is the state of sip_handle_t when the crash happens:
(gdb) print /x *(sip_handle_t)0x3ffc8630
$1 = {ctx = 0x3ffc85a0, event_handler = 0x400dbdf8, event_msg = {type = 0x1, data = 0x3ffbb1f0, data_len = 0x10}, transport_list = 0x3ffc8438, trans_session = 0x3ffc88b4, parser = {type = 0x0, flags = 0x0, state = 0x0, header_state = 0x0, index = 0x0, lenient_http_headers = 0x0, nread = 0x0, content_length = 0x0, http_major = 0x0, http_minor = 0x0, status_code = 0x0, method = 0x0, http_errno = 0x0, upgrade = 0x0, data = 0x0}, parser_settings = { on_message_begin = 0x0, on_url = 0x0, on_status = 0x0, on_header_field = 0x0, on_header_value = 0x0, on_headers_complete = 0x0, on_body = 0x0, on_message_complete = 0x0, on_chunk_header = 0x0, on_chunk_complete = 0x0}, task_event_group = 0x3ffcaf44, state = 0x2, is_data_available = 0x0, uas_flag = 0x0, uac_flag = 0x0, timeout_tick_ms = 0x0, timeout_value_ms = 0x0, ringing_tick_ms = 0x0, run = 0x1, running = 0x1, send_options = 0x0, tx_buffer = 0x3ffc8a38, rx_buffer = 0x3ffc9648, req_buffer = 0x3ffca314, auth_header = 0x3ffd5198, auth_nc = 0x1, cseq = 0x2, local_artp_port = 0x0, local_vrtp_port = 0x0, local_port = 0xf409, server_port = 0x13c5, extension = 0x0, server_addr = 0x3ffbcb8c, local_addr = 0x3ffcaf24, scheme = 0x3ffc8410, call_id = 0x3ffce148, ext = 0x0, via = 0x3ffca284, username = 0x3ffc87f4, password = 0x3ffc87c4, expires = 0xe10, on_call_info = 0x0, current_info = 0x3ffd510c, option_info = 0x3ffd510c, ringing_sec = 0x0, register_expires_sec = 0x3de, register_expires_tick = 0x10d4c3, remote_port = 0x0, rtp_audio = 0x0, rtp_video = 0x0, rtp_audio_run = 0x0, rtp_video_run = 0x0, transport_name = 0x3ffc8424, now_time = 0x3ffca258, acodec = 0x1, video_codec_info = {vcodec = 0x0, width = 0x0, height = 0x0, fps = 0x0, len = 0x0}, send_auth_type = 0x0, reg_retry = 0x0, custom_header_info = 0x3ffc89f8, network_timeout_ms = 0xbb8, use_public_addr = 0x0, keepalive = 0x1e, user_agent = 0x3ffc8a18, private_header = 0x0}
current_info and option_info point to the same memory address. When an OPTIONS request that starts with stray \r\n line breaks is received, sip_clean_parse is called for option_info, which also frees the memory pointed to by current_info.
I had comprehensive heap debugging enabled. When I looked at the contents of current_info, its internal pointers held 0xFEFEFEFE, the fill pattern written to freed memory, meaning that the block had already been freed. When sip_reconnect was then called, it tried to free the memory referenced by current_info, but the address was 0xFEFEFEFE, hence not in the heap.
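As a reading aid for dumps like the one above: with ESP-IDF's comprehensive heap poisoning, allocated memory is filled with 0xCE and freed memory with 0xFE. The helper below is a hypothetical debugging aid (the names are not part of ESP-IDF or ADF) that classifies a 32-bit word read from a suspect structure, assuming 32-bit pointers as on the ESP32.

```c
#include <stdint.h>

/* Fill patterns used by comprehensive heap poisoning, seen as 32-bit words. */
#define SKETCH_FILL_ALLOCATED 0xCECECECEu /* allocated but never written */
#define SKETCH_FILL_FREED     0xFEFEFEFEu /* block was freed */

typedef enum {
    SKETCH_WORD_NORMAL = 0,    /* no poison pattern detected */
    SKETCH_WORD_UNINITIALIZED, /* read from allocated, unwritten memory */
    SKETCH_WORD_FREED          /* read from memory that was already freed */
} sketch_word_kind_t;

/* Classifies a pointer-sized word read from a structure under inspection. */
static sketch_word_kind_t sketch_classify_word(uint32_t word)
{
    if (word == SKETCH_FILL_FREED) {
        return SKETCH_WORD_FREED;
    }
    if (word == SKETCH_FILL_ALLOCATED) {
        return SKETCH_WORD_UNINITIALIZED;
    }
    return SKETCH_WORD_NORMAL;
}
```

A member that reads as 0xFEFEFEFE was loaded from a freed block, which matches the "free() target pointer is outside heap areas" assertion when that value is later passed to free().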
The message received in the right part, triggers this behavior:
[screenshot of the received message]
When I skipped the part that compares the method to 0x6 (OPTIONS), so that no reply was sent to \r\nOPTIONS, the issue no longer reproduced.
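For context, RFC 3261 (section 7.5) requires implementations that process SIP over stream-oriented transports to ignore any CRLF that appears before the start-line, so a request arriving as \r\nOPTIONS should be treated like a plain OPTIONS request. The sketch below shows one way to normalize the buffer before method dispatch. The function names are hypothetical and this is not how esp_rtc is implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns a pointer to the first byte that is not CR or LF and reduces
 * *len by the number of bytes skipped.
 */
static const char *sketch_skip_leading_crlf(const char *buf, size_t *len)
{
    while (*len > 0 && (*buf == '\r' || *buf == '\n')) {
        buf++;
        (*len)--;
    }
    return buf;
}

/*
 * Returns 1 if the message, after leading CR/LF bytes are skipped,
 * starts with the OPTIONS method token, otherwise 0.
 */
static int sketch_is_options_request(const char *buf, size_t len)
{
    static const char method[] = "OPTIONS ";
    const size_t method_len = sizeof(method) - 1;

    buf = sketch_skip_leading_crlf(buf, &len);
    return len >= method_len && memcmp(buf, method, method_len) == 0;
}
```

Running this check before any method comparison means both forms of the request take the same handling path.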
@vasilerares Many thanks for the info, let me double-check.
I have reproduced this issue; the root cause is almost the same as in your analysis. A normal OPTIONS request goes to a separate function, while a malformed one with a leading "\r\n" does not enter that function and goes into another one instead, which leaves the option info uncleared and then releases it unexpectedly. You can try the following library to check whether the issue is fixed.
esp_media_protocol_fix_option_crash.zip https://github.com/user-attachments/files/21529103/esp_media_protocol_fix_option_crash.zip
Thank you. I started the test and I will come back with feedback.
@TempoTian the new library works fine, thank you.
One more thing: can you mention in the documentation that those functions must return 0 if esp_rtc_bye has been called? The timer task that calls esp_rtp_send could call _send_audio after esp_rtc_bye was called and use the return value as the data length, which causes an exception.
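The request above can be sketched from the application side. The callback name, its signature, and the call-state flag below are assumptions made for illustration, not the esp_rtc API. The idea is that the send callback returns 0 once the call has ended, and that any code using the return value as a length validates it first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical call-state flag; a real application would set it to false
 * when it ends the call and would need proper synchronization between tasks. */
static volatile bool s_sketch_call_active;

/*
 * Hypothetical audio send callback.
 * Returns the number of bytes written to data, or 0 when no call is active
 * or the arguments are invalid, so the return value is never negative.
 */
static int sketch_send_audio(unsigned char *data, int len)
{
    if (!s_sketch_call_active || data == NULL || len <= 0) {
        return 0;
    }
    memset(data, 0, (size_t)len); /* placeholder for real encoded audio */
    return len;
}

/* Caller-side guard: accept a returned length only if it fits the buffer. */
static int sketch_checked_len(int ret, int capacity)
{
    return (ret > 0 && ret <= capacity) ? ret : 0;
}
```

Validating the length on the caller side as well keeps a late timer tick from treating an error code as a buffer size.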