ArduinoCore-renesas
Network Related Crash on Long Running MQTT connections
I am seeing a crash on the Portenta C33 when using an MQTT client for a long duration (~15 minutes). The crash happens while the sketch is inside a delay() call: the network timer callback fires and the fault occurs within lwip_task of CNetIF.cpp. It certainly looks like a memory management issue in the networking code.
We are using an SSL Client and certificates for our server authentication.
Here is the call stack for the crash:
_free_r@0x00060a0a (/_free_r.dbgasm:51)
__gnu_cxx::new_allocator<CMsg>::deallocate@0x0005ae5e (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/ext/new_allocator.h:125)
std::allocator_traits<std::allocator<CMsg> >::deallocate@0x0005ae5e (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/alloc_traits.h:462)
std::_Deque_base<CMsg, std::allocator<CMsg> >::_M_deallocate_node@0x0005ae5e (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/stl_deque.h:609)
std::_Deque_base<CMsg, std::allocator<CMsg> >::_M_destroy_nodes@0x0005ae5e (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/stl_deque.h:743)
std::_Deque_base<CMsg, std::allocator<CMsg> >::~_Deque_base@0x0005ae74 (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/stl_deque.h:665)
std::deque<CMsg, std::allocator<CMsg> >::~deque@0x0005b1c4 (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/stl_deque.h:1045)
std::queue<CMsg, std::deque<CMsg, std::allocator<CMsg> > >::~queue@0x0005b1c4 (/Users/kylevisner/.platformio/packages/[email protected]/arm-none-eabi/include/c++/7.2.1/bits/stl_queue.h:96)
CEspCom::clearToEspQueue@0x0005b1c4 (/CEspCom::clearToEspQueue.dbgasm:109)
esp_host_there_are_data_to_be_tx@0x0005a6e4 (/esp_host_there_are_data_to_be_tx.dbgasm:12)
esp_host_spi_transaction@0x0005a6f8 (/esp_host_spi_transaction.dbgasm:5)
esp_host_perform_spi_communication@0x0005a73e (/esp_host_perform_spi_communication.dbgasm:7)
CEspControl::communicateWithEsp@0x00058ed8 (/CEspControl::communicateWithEsp.dbgasm:10)
CLwipIf::lwip_task@0x0004c0a8 (/CLwipIf::lwip_task.dbgasm:30)
CLwipIf::timer_cb@0x0004c10a (/CLwipIf::timer_cb.dbgasm:4)
r_gpt_call_callback@0x0002e174 (Unknown Source:1719)
<signal handler called>@0xffffffe9 (Unknown Source:0)
bsp_prv_software_delay_loop@0x0002f864 (/bsp_prv_software_delay_loop.dbgasm:1)
delay@0x00023c0a (/delay.dbgasm:4)
SSLClient::read@0x0001f628 (/SSLClient::read.dbgasm:8)
SSLClient::connected@0x0001f5b8 (/SSLClient::connected.dbgasm:10)
Thanks for your report. I got the same error while working on https://github.com/arduino/ArduinoCore-renesas/pull/234. In that PR I am trying to deal with all the network-related issues, for the time being on Ethernet and WiFi, and I will try to address this issue there.
Thanks, @andreagilardoni. Is there a workaround in the meantime to unblock us until that PR is done?
You can try using my PR and disabling the timer inside the network stack:
- Take the example here as a reference
- You need to comment this line
- You need to call CLwipIf::getInstance().task() inside the loop() function
- Design your application to avoid blocking calls as much as possible
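To illustrate the last two points, here is a minimal host-side sketch of the pattern: service the network on every pass of the loop and replace delay() with a non-blocking timer. Everything in it is hypothetical and only for illustration: FakeNetStack stands in for the network stack (its task() plays the role of the polling call mentioned above, whose real name and availability depend on the PR), and PeriodicTimer, App and simulate are made-up helpers driven by a simulated millisecond clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the network stack. On the board, this role would be
// played by whatever polling entry point the PR exposes.
struct FakeNetStack {
    unsigned polls = 0;
    void task() { ++polls; }  // one step of network processing
};

// Non-blocking periodic timer, used instead of delay(interval).
class PeriodicTimer {
public:
    explicit PeriodicTimer(uint32_t interval_ms, uint32_t start_ms = 0)
        : interval_(interval_ms), last_(start_ms) {}

    // Returns true once per interval. Unsigned subtraction keeps the
    // comparison correct when the millisecond counter wraps around.
    bool expired(uint32_t now_ms) {
        if (now_ms - last_ >= interval_) {
            last_ = now_ms;
            return true;
        }
        return false;
    }

private:
    uint32_t interval_;
    uint32_t last_;
};

struct App {
    FakeNetStack net;
    PeriodicTimer publish{1000};  // application work is due every 1000 ms
    unsigned published = 0;

    // One pass of loop(): always service the network, and do the
    // application work only when it is due. Nothing here blocks.
    void loop_once(uint32_t now_ms) {
        net.task();
        if (publish.expired(now_ms)) {
            ++published;  // placeholder for e.g. an MQTT publish
        }
    }
};

// Drive the loop with a simulated clock instead of real hardware time.
App simulate(uint32_t duration_ms, uint32_t step_ms) {
    App app;
    for (uint32_t now = 0; now <= duration_ms; now += step_ms) {
        app.loop_once(now);
    }
    return app;
}
```

On the board, the body of loop_once would live in loop(), with millis() as the clock. The point of the design is that the network stack gets polled on every pass, so it no longer depends on a timer interrupt firing in the middle of a blocking call.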
Any kind of feedback on this work is appreciated.
@andreagilardoni I was able to build with your PR. Two items:
- if you comment out line 30 of CNetIf.h, you'll get a build error.
- if you attempt to build it with CLwipIf::getInstance().task(), you'll get the following error:
Compilation error: 'class CLwipIf' has no member named 'task'
Well, after many weeks of wireless networking problems on the C33 platform, it looks like there are no fixes anytime soon. On our system we even "disable" networking after power-on (and brief use to access NTP), but the networking still causes a system hang after many hours of running (rare but fatal). It appears that there is something the class destructors are not doing correctly, since fragments of "WiFi" functionality are left operating after disconnection/shutdown. I think the advertisements for the Arduino C33 should NOT list networking, since it doesn't work correctly as yet.
Hello, have you found a way to fix this issue? It is very annoying.
Jérémy