STM32CubeWL
STM32CubeWL copied to clipboard
Radio IRQ during HAL_SUBGHZ_ExecSetCmd causes infinite Radio IRQ handler re-entering
Describe the set-up
- Custom board populated with the STM32WLE5.
- Keil MDK-ARM V5.36.0.0.
- 8 devices running in LoRaWAN class C.
Describe the bug (skip if none)
If a Radio IRQ occurs during HAL_SUBGHZ_ExecSetCmd, it will cause an infinite re-entering of the Radio IRQ handler.
In LoRaWAN class C, the transceiver operates most of the time in Rx. When the transceiver is in Rx and the application requests an uplink, the LoRaWAN stack will serve the uplink request by first stopping the Rx mode and setting the transceiver in standby mode by calling RadioStandby in RadioSetTxConfig. RadioStandby will call SUBGRF_SetStandby and SUBGRF_SetStandby will call HAL_SUBGHZ_ExecSetCmd. HAL_SUBGHZ_ExecSetCmd will lock the subghz handle by __HAL_LOCK(hsubghz). If a Radio IRQ occurs during HAL_SUBGHZ_ExecSetCmd, e.g. triggered by reception of a packet, then the SUBGHZ_Radio_IRQHandler will be called. The SUBGHZ_Radio_IRQHandler will call HAL_SUBGHZ_IRQHandler. The HAL_SUBGHZ_IRQHandler will try to retrieve the interrupt source from the transceiver by calling HAL_SUBGHZ_ExecGetCmd, but HAL_SUBGHZ_ExecGetCmd will return immediately since it cannot lock the subghz handle. The subghz handle has already been locked by HAL_SUBGHZ_ExecSetCmd. At the end of HAL_SUBGHZ_IRQHandler, the handler will try to clear the interrupt request by calling HAL_SUBGHZ_ExecSetCmd, but HAL_SUBGHZ_ExecSetCmd will also return immediately since it cannot lock the subghz handle. The result is that the radio IRQ won't be cleared, causing a re-entering of SUBGHZ_Radio_IRQHandler immediately after exiting it.
Screenshots
Best regards Roman Jasmann
ST Internal Reference: 134090
Hi @RomanJasmann,
For a better analysis of your issue, would you please give us more details about the version of STM32CubeWL package you are using ?
With regards,
Hi,
I'm using STM32CubeWL V1.0.0.
Best regards Roman Jasmann
Hi @RomanJasmann,
Would you please try with the latest version v1.2.0 of the STM32CubeWL package available on the ST.com.
With regards,
@ASELSTM
Hi, the implementation of __HAL_LOCK in v1.2.0 is the same as in v1.0.0. So the problem will also occur with v1.2.0. The root cause of the problem is an incorrect implementation of __HAL_LOCK. See this discussion:
https://community.st.com/s/question/0D50X0000C5Tns8SQC/bug-stm32-hal-driver-lock-mechanism-is-not-interrupt-safe
Here is some additional description:
- When the device is running in class C mode, it opens a continous Rx window with the following PHY layer downlink configuration (note the inverted IQ polarization):
- As soon as the application of the device requests the LoRaWAN stack to send an uplink, the stack will check the request and serve the request by switching from Rx to Tx in RegionEU868TxConfig. In RegionEU868TxConfig, the stack will primary compute the Tx PHY layer configuration (2.1), set the radio frequency to the next Tx channel frequency (2.2), and apply the computed PHY layer config to the radio (2.3). Note that the LoRaWAN stack does not set the radio in standby mode before setting the radio frequency to the next Tx uplink channel frequency (2.2) and applying the uplink Tx PHY layer configuration (2.3). This means that the radio will keep a downlink PHY layer configuration and stay in Rx until Radio.SetTxConfig (or more precisely, the SPI transaction to set the radio in Tx mode associated with Radio.SetTxConfig) has not finished.
Note also that Radio.Rx enables all radio interrupts and not just IRQ_RX_DONE and IRQ_RX_TX_TIMEOUT whereas Radio.Send only enables IRQ_TX_DONE and IRQ_RX_TX_TIMEOUT. So even an IRQ_PREAMBLE_DETECTED could be requested from the radio.
- Radio.SetTxConfig will lead to a call of HAL_SUBGHZ_ExecSetCmd. HAL_SUBGHZ_ExecSetCmd will call __HAL_LOCK(hsubghz) and thereby allocate the hsubghz handle. Remember that at this point the radio is still in Rx mode. If a radio interrupt occurs now between __HAL_LOCK(hsubghz) (3.1) and the point where the MCU issues the radio SPI command by driving NSS high (3.2), then it will create a deadlock and cause an infinite radio IRQ handler re-entering.
- The reason for the deadlock is that SUBGHZ_Radio_IRQHandler will call HAL_SUBGHZ_IRQHandler. The HAL_SUBGHZ_IRQHandler will try to retrieve the interrupt source from the radio by calling HAL_SUBGHZ_ExecGetCmd (4.1), but HAL_SUBGHZ_ExecGetCmd will return HAL_BUSY since the subghz handle has already been allocated by __HAL_LOCK in HAL_SUBGHZ_ExecSetCmd (3.1).
At the end of HAL_SUBGHZ_IRQHandler, the handler will try to clear the interrupt request by calling HAL_SUBGHZ_ExecSetCmd (4.2), but HAL_SUBGHZ_ExecSetCmd will also return HAL_BUSY. The result is that the radio IRQ won't be cleared, causing a re-entering of SUBGHZ_Radio_IRQHandler immediately after exiting it.
Best regards Roman Jasmann
Hi Roman,
Thank you for providing such a clear explanation. I have encountered what I think is the same issue and it was very helpful being able to follow your logic through.
Have you found a reliable fix for this? My device is operating in Class A but still hits a fault at exactly the same point, as the HAL_SUBGHZ_ExecSetCmd is run to set the SPI NSS high. I'm struggling to find a good way around it!
Annoyingly this wasn't occuring when using STMCUBEIDE v1.10.0, it's only after updating I'm having the issue but I can't see any clear difference in generated files from before and after the update. Thank you for your time.
All the best, Alex
Hi @RomanJasmann,
Would you please try to test with the latest available version of STM32CubeWL and check whether you are still having the same issue or not.
With regards,
Hi, We are facing the same issue when using STM32CubeWL 1.3.0. It happens frequently when using ABP mode. @RomanJasmann did you resolve the problem.
Kind regards.
Hi,
In Middlewares\Third_Party\SubGHz_Phy\stm32_radio_driver\ radio_driver.c, the HAL_SUBGHZ is always called within Critical Section to forbid IRQs within the Call e.g:
void SUBGRF_WriteRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_WriteRegisters( &hsubghz, address, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ReadRegisters( &hsubghz, address, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_WriteBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_WriteBuffer( &hsubghz, offset, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ReadBuffer( &hsubghz, offset, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_WriteCommand( SUBGHZ_RadioSetCmd_t Command, uint8_t *pBuffer,
uint16_t Size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ExecSetCmd( &hsubghz, Command, pBuffer, Size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadCommand( SUBGHZ_RadioGetCmd_t Command, uint8_t *pBuffer,
uint16_t Size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ExecGetCmd( &hsubghz, Command, pBuffer, Size );
CRITICAL_SECTION_END();
}
@YoannBo
You are right. Thank you for the hint. They have changed the implementation in v1.3.0. Compare
STM32CubeWL V1.0.0:
https://github.com/STMicroelectronics/STM32CubeWL/blob/v1.0.0/Middlewares/Third_Party/SubGHz_Phy/stm32_radio_driver/radio_driver.c
#define SUBGRF_WriteCommand( x, y, z ) HAL_SUBGHZ_ExecSetCmd( &hsubghz, (x), (y), (z) )
#define SUBGRF_ReadCommand( x, y, z ) HAL_SUBGHZ_ExecGetCmd( &hsubghz, (x), (y), (z) )
void SUBGRF_WriteRegister( uint16_t addr, uint8_t data )
{
HAL_SUBGHZ_WriteRegisters( &hsubghz, addr, (uint8_t*)&data, 1 );
}
uint8_t SUBGRF_ReadRegister( uint16_t addr )
{
uint8_t data;
HAL_SUBGHZ_ReadRegisters( &hsubghz, addr, &data, 1 );
return data;
}
void SUBGRF_WriteRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
HAL_SUBGHZ_WriteRegisters( &hsubghz, address, buffer, size );
}
void SUBGRF_ReadRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
HAL_SUBGHZ_ReadRegisters( &hsubghz, address, buffer, size );
}
void SUBGRF_WriteBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
HAL_SUBGHZ_WriteBuffer( &hsubghz, offset, buffer, size );
}
void SUBGRF_ReadBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
HAL_SUBGHZ_ReadBuffer( &hsubghz, offset, buffer, size );
}
STM32CubeWL V1.3.0:
https://github.com/STMicroelectronics/STM32CubeWL/blob/v1.3.0/Middlewares/Third_Party/SubGHz_Phy/stm32_radio_driver/radio_driver.c
void SUBGRF_WriteRegister( uint16_t addr, uint8_t data )
{
HAL_SUBGHZ_WriteRegisters( &hsubghz, addr, (uint8_t*)&data, 1 );
}
uint8_t SUBGRF_ReadRegister( uint16_t addr )
{
uint8_t data;
HAL_SUBGHZ_ReadRegisters( &hsubghz, addr, &data, 1 );
return data;
}
void SUBGRF_WriteRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_WriteRegisters( &hsubghz, address, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadRegisters( uint16_t address, uint8_t *buffer, uint16_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ReadRegisters( &hsubghz, address, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_WriteBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_WriteBuffer( &hsubghz, offset, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadBuffer( uint8_t offset, uint8_t *buffer, uint8_t size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ReadBuffer( &hsubghz, offset, buffer, size );
CRITICAL_SECTION_END();
}
void SUBGRF_WriteCommand( SUBGHZ_RadioSetCmd_t Command, uint8_t *pBuffer,
uint16_t Size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ExecSetCmd( &hsubghz, Command, pBuffer, Size );
CRITICAL_SECTION_END();
}
void SUBGRF_ReadCommand( SUBGHZ_RadioGetCmd_t Command, uint8_t *pBuffer,
uint16_t Size )
{
CRITICAL_SECTION_BEGIN();
HAL_SUBGHZ_ExecGetCmd( &hsubghz, Command, pBuffer, Size );
CRITICAL_SECTION_END();
}
Of course, the critical sections will eliminate the problem.
@ASELSTM You can close the issue.
@ASELSTM
HAL_SUBGHZ_WriteRegisters and HAL_SUBGHZ_ReadRegisters in SUBGRF_WriteRegister and SUBGRF_ReadRegister must be protected by a critical section too:
https://github.com/STMicroelectronics/STM32CubeWL/blob/main/Middlewares/Third_Party/SubGHz_Phy/stm32_radio_driver/radio_driver.c
void SUBGRF_WriteRegister( uint16_t addr, uint8_t data )
{
HAL_SUBGHZ_WriteRegisters( &hsubghz, addr, (uint8_t*)&data, 1 );
}
uint8_t SUBGRF_ReadRegister( uint16_t addr )
{
uint8_t data;
HAL_SUBGHZ_ReadRegisters( &hsubghz, addr, &data, 1 );
return data;
}