
Application hangs because MBMUXIF_LoraSendCmd() command stuck

Open metaTinker opened this issue 1 year ago • 4 comments

Setup

  • STM32WL55JC1 embedded on a custom PCB with other peripherals
  • STM32CubeIDE
  • LoRa gateway - Multitech MTCDT3AC model with a built-in network server, join server, packet forwarder and gateway.
  • STM32CubeWL firmware version 1.3.0

Application hangs because MBMUXIF_LoraSendCmd() sometimes blocks forever on Sem_MbLoRaRespRcv

My application is built around the LoRaWAN_End_Node_DualCoreFreeRTOS example provided in the firmware package. The CM4 application sends telemetry roughly every 4-5 minutes. It runs well for a few days, then MBMUXIF_LoraSendCmd() suddenly gets stuck waiting on Sem_MbLoRaRespRcv. After reading more about how the dual-core system works, I found that if a response is not received through the IPCC channels, the semaphore is never released. This is a serious pitfall for me because my application must send telemetry continuously at that 4-5 minute rate.

I cannot think of a reason why the CM4 core would fail to receive a response to a telemetry send command.

How to reproduce the bug

At this time, I cannot pinpoint how to reproduce this bug. It appears to happen randomly: sometimes the system runs for a few days before the bug occurs, and sometimes it happens right away.

Additional context

I have set up an RTOS queue so that the send API is not bombarded with messages. However, when this issue occurs the queue fills up and no messages are sent.

** Code Snippet **

void MBMUXIF_LoraSendCmd(void)
{
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_1 */

  /* USER CODE END MBMUXIF_LoraSendCmd_1 */
  if (MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID) == 0)
  {
    osSemaphoreAcquire(Sem_MbLoRaRespRcv, osWaitForever);
  }
  else
  {
    Error_Handler();
  }
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_Last */

  /* USER CODE END MBMUXIF_LoraSendCmd_Last */
}

** Additional Info/questions **

By design, this function appears to wait forever on the semaphore. If no response comes back, could there be a retry mechanism, or a communication-error callback that the application can act on?
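To make the request concrete, here is a minimal sketch of a bounded wait with a retry limit. It is not ST's implementation. The names LoraSendCmdBounded, LoraCmdResult_t, SendCmdFn_t and WaitRespFn_t are hypothetical, and the two hooks are assumptions: on target, the send hook would wrap MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID) and the wait hook would wrap osSemaphoreAcquire(Sem_MbLoRaRespRcv, timeout) with a finite timeout instead of osWaitForever. The DemoSend/DemoWait stand-ins exist only so the loop can be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; not part of STM32CubeWL. */
typedef enum
{
  LORA_CMD_OK = 0,       /* a response arrived within the time budget */
  LORA_CMD_SEND_FAILED,  /* the send hook reported an error */
  LORA_CMD_NO_RESPONSE   /* every attempt timed out */
} LoraCmdResult_t;

/* Hypothetical hooks (assumptions, see the text above). */
typedef int (*SendCmdFn_t)(void);                    /* returns 0 on success */
typedef int (*WaitRespFn_t)(uint32_t timeout_ticks); /* returns 0 if a response was received */

/* Send the command and wait a bounded time for the response,
 * repeating up to max_attempts times before giving up. */
LoraCmdResult_t LoraSendCmdBounded(SendCmdFn_t send_cmd, WaitRespFn_t wait_resp,
                                   uint32_t timeout_ticks, uint32_t max_attempts)
{
  assert(send_cmd != NULL && wait_resp != NULL);

  for (uint32_t attempt = 0; attempt < max_attempts; attempt++)
  {
    if (send_cmd() != 0)
    {
      return LORA_CMD_SEND_FAILED;
    }
    if (wait_resp(timeout_ticks) == 0)
    {
      return LORA_CMD_OK;
    }
    /* Timed out: loop around and try again. */
  }
  return LORA_CMD_NO_RESPONSE; /* caller decides: log, report, or reset */
}

/* Host-side stand-ins, used only to exercise the loop off-target. */
static uint32_t demo_timeouts_left = 0; /* number of waits that should time out */
static uint32_t demo_wait_calls = 0;    /* number of times DemoWait was called */

static int DemoSend(void)
{
  return 0; /* pretend the mailbox accepted the command */
}

static int DemoWait(uint32_t timeout_ticks)
{
  (void)timeout_ticks;
  demo_wait_calls++;
  if (demo_timeouts_left > 0)
  {
    demo_timeouts_left--;
    return -1; /* simulated timeout */
  }
  return 0; /* simulated response */
}
```

With this shape, the caller gets LORA_CMD_NO_RESPONSE back instead of hanging, and can raise a communication error. Whether it is safe to re-send a command while the other core may still be handling the previous one is an open question that only ST can answer. If it is not safe, setting max_attempts to 1 reduces the sketch to a plain timeout.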

metaTinker, Mar 12 '24 00:03

ST Internal Reference: 176222

RJMSTM, Mar 15 '24 11:03

@RJMSTM any updates on this?

metaTinker, Apr 02 '24 17:04

@ALABSTM @RJMSTM is there any resolution to this?

metaTinker, May 09 '24 18:05

Hi @metaTinker,

We understand the issue. We will get back to you when we have updates to share. This may take some time. Thank you for your understanding.

With regards,

ALABSTM, May 27 '24 15:05