uavcan: prevent system crash on stm32h7 when restarting uavcan node
Solved Problem
When restarting uavcan on STM32, for example by running uavcan stop uavcan start, the system crashes.
The issue was introduced with https://github.com/PX4/PX4-Autopilot/pull/20746, and affects 1.14 stable release and current main.
The issue occurs because the can driver is re-initialized when the node is restarted. The previous author noted in a comment that it shouldn't be re-initialized, but accidentally put the init call in the node initialization section of UavcanNode::Run. This made it run once per node restart instead of once per system boot.
Solution
Move driver initialization to UavcanNode::start, where the CanInitHelper is initialized, and where there is already a check if the can driver has been previously initialized.
Now we can also avoid fetching the bitrate parameter twice.
Changelog Entry
For release notes:
Bugfix: Prevent system crash on stm32 when restarting uavcan node
Test coverage
- Tested on a CubeOrange flight controller
Context
Crash dump from 1.14 when stopping and starting uavcan: fault_2000_01_01_00_02_28.log
Oh sorry to hear this my PR caused this bug
But actually there's a good reason to init the CAN driver UavcanNode::Run this is because UavcanNode::start runs in a different thread and you want to initialize the CAN device in the same thread you're reading it from. If this PR goes in it will probably break the FMUK66, UCANS32K1, V6X-RT targets.
Wouldn't it better to actually de-initialize the CAN driver in, avoiding the crash? https://github.com/PX4/PX4-Autopilot/blob/9f0f3b9525085bf9147b7cdacf39fef5cb098235/src/drivers/uavcan/uavcan_main.cpp#L993
Because technically after uavcan start uavcan stop the STM32 can driver is still running.
@PetervdPerk-NXP Thanks for the quick response!
Yeah, sorry, I was a bit quick and didn't look close enough at the entire commit history and PR to see that I just moved the initialization back to where it was, and that there was a good reason for the change in the first place :sweat_smile:
According to the comment in the source code, there is no way to deinit the driver on STM32. We could maybe add a static flag that skips re-initialization on stm32?
Or I guess the best would be to actually include a deinit method in the CanInitHelper and perform proper deinitialization.
@PetervdPerk-NXP I investigated a bit further why the code actually crashed by running init twice, and it is because the init bit is not properly set here, so it times out: https://github.com/PX4/PX4-Autopilot/blob/926e7878afc6ce804b0b4c097fbf94d405841f42/src/drivers/uavcan/uavcan_drivers/stm32h7/driver/src/uc_stm32h7_can.cpp#L691
Edit: It seems like just letting initOnce run every time and not just once, the driver handles being re-initialized just fine...
It is because initOnce resets the FDCan driver. Just always performing a reset with RCC_APB1RSTR_CAN1RST also solves the issue on CubeOrange (stm32h7). The same probably applies to regular stm32 driver. Unsure about any side effects running this multiple times could cause. This is beyond my area of expertise :sweat_smile:
Manually stopping and starting uavcan is kind of an edge case anyway, might not really be that important to do anything here.
@davids5 any idea about this?