CPU2 crash report during SHCI_C2_BLE_Init(...)

Open tim-nordell-nimbelink opened this issue 1 year ago • 3 comments

This is a cross post from the community forum, but I haven't gotten a response there to my bug report yet and since this is crashing within the CPU2 side of things an actual bug report here makes sense. It's incredibly difficult to debug the CPU2 side since the code is delivered as an encrypted blob.

Describe the set-up

  • Nucleo-WB55 or our custom board utilizing a STM32WB55
  • gcc-arm-none-eabi-10-2020-q4-major

Describe the bug

Upon invoking SHCI_C2_BLE_Init(...), CPU2 enters a hard fault within the BLE HCI stacks.

How To Reproduce

I'm not quite sure yet what it is in our codebase that causes this. I could maybe provide a reduced pre-compiled binary to ST, but I cannot provide the full source code from our proprietary project. It's 100% reproducible with our code running on CPU1, and from what I can tell, the HSEM/IPCC/RCC peripherals are all in the same state as in the example projects, so I'm currently at a loss as to why this occurs. I've also copied all of the SHCI_C2_BLE_Init(...) parameters from some of the newer examples, to no avail. With gdb I validated that the buffer contents sent to CPU2 in shared memory through the mailbox mechanism are exactly the same as those sent by the transparent mode example codebase.

Within our codebase, I can run v1.11.x through v1.15.0 of the BLE stack and successfully scan for BLE packets. v1.16.x through v1.19.x report a "security attack" upon SHCI_C2_BLE_Init(...) invocation, and v1.20.x has a hard fault. These variations in BLE stack behavior are all without changing the CPU1 firmware.

Here are the hard fault codes of the various v1.20.x HCI stacks as soon as I invoke SHCI_C2_BLE_Init(...) in our codebase:

v1.20.0 of stm32wb5x_BLE_HCILayer_fw.bin has a hard fault:

0x20030000 <TL_RefTable>:       0x1170fd0f      0x00003284      0x00002a33      0x2003f198

v1.20.0 of stm32wb5x_BLE_HCI_AdvScan_fw.bin has a hard fault:

0x20030000 <TL_RefTable>:       0x1170fd0f      0x00003160      0x00001f6f      0x2003ef50

v1.20.0 of stm32wb5x_BLE_HCILayer_extended_fw.bin has a hard fault:

0x20030000 <TL_RefTable>:       0x1170fd0f      0x00003390      0x00002b3f      0x2003f6f8

Please let me know if the PC, SP, and LR inside CPU2 are sufficient to get an initial fault analysis, or if I need to prepare a minimal pre-compiled binary for CPU1 exhibiting this. I'm still attempting to narrow down what's different between the examples and our codebase - we had integrated the BLE portions of the v1.11.x STM32WBCube codebase quite a while ago but are trying to migrate to the newer version of the BLE stack so we can address the erratum that requires calling the relatively new SHCI_C2_SetSystemClock(...) command.
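
For anyone reading the dumps above, here is a minimal sketch of how the four words at SRAM2A_BASE could be split up. The keyword value is taken from the dumps. The order of the remaining words (PC, then LR, then SP) is an assumption on my part, and the type and function names are hypothetical, not part of STM32CubeWB.

```c
#include <stdint.h>

/* First word of every hard fault dump shown above. */
#define CPU2_HARDFAULT_KEYWORD 0x1170FD0Fu

/* Hypothetical container; the PC, LR, SP field order is an assumption. */
typedef struct {
    uint32_t pc;
    uint32_t lr;
    uint32_t sp;
} cpu2_fault_report_t;

/* Returns 1 and fills *out when words[0] is the hard fault keyword, else 0. */
static int cpu2_decode_hardfault(const uint32_t words[4], cpu2_fault_report_t *out)
{
    if (words[0] != CPU2_HARDFAULT_KEYWORD) {
        return 0;
    }
    out->pc = words[1];
    out->lr = words[2];
    out->sp = words[3];
    return 1;
}
```

Under that assumed order, the HCILayer dump would decode to PC 0x00003284, LR 0x00002a33 and SP 0x2003f198. The odd LR value would be consistent with a Thumb return address, and the last word falls in the same 0x2003xxxx region as the table itself, which is why I read it as the stack pointer.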

tim-nordell-nimbelink avatar Aug 23 '24 22:08 tim-nordell-nimbelink

ST Internal Reference: 189569

RJMSTM avatar Aug 27 '24 14:08 RJMSTM

I spent some time yesterday/today and figured out how to dump out the code for CPU2 for these encrypted firmware binaries so that I could properly debug this.

This is what I found:

  • v1.16.0:

    • This firmware started to always validate p_ble_table->phci_acl_data_buffer as pointing to non-secure SRAM. Previously, this validation depended on whether SHCI_C2_Ble_Init_Cmd_Param_t's Options byte had bit 0 set to 1. I had this bit set to 0 in our code, so the value of p_ble_table->phci_acl_data_buffer was ignored prior to v1.16.0.

      This is sort of mentioned in the release notes of v1.16.0 as:

      ID 136949 : For ACL_DATA activation, the BLE options flag has to be configured with SHCI_C2_BLE_INIT_OPTIONS_LL_ONLY with Full and Full extended stack binaries and no special BLE options flag required in “HCI_ONLY” (ie Light, HCI layer, ext HCI layer binaries)

      but the way it's read makes you think it's required only if you want to use ACL_DATA, and it doesn't mention that this is required with the BLE_HCI_AdvScan_fw now too.

    • (The validation in v1.15.0 of stm32wb5x_BLE_HCI_AdvScan_fw.bin was skipped at instruction offset 0xca8 and jumped to 0xcb8, which skipped the call to the validation function for this parameter.)

  • v1.20.0:

    • This firmware invokes a software breakpoint when it has a security fault just after the security attack key is set into SRAM2A_BASE. The software breakpoint in turn caused a hard fault, which ultimately resulted in SRAM2A_BASE being set twice in a row (once with a security fault, and then once with a hard fault).
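
The v1.16.0 change described above can be summarized with a small model. This is only a sketch of the behavior I observed in the AdvScan binary, not ST's code. The function name, the macro name and the use of the minor version number as a parameter are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the Options byte, as described above. The macro name is hypothetical. */
#define OPTIONS_BIT0 0x01u

/* Hypothetical model: true when CPU2 checks the ACL data buffer pointer. */
static bool acl_buffer_is_validated(unsigned fw_minor, uint8_t options)
{
    if (fw_minor >= 16u) {
        return true;                        /* v1.16.0 and later: always validated */
    }
    return (options & OPTIONS_BIT0) != 0u;  /* earlier: only when bit 0 is set */
}
```

With bit 0 cleared, as in our code, this model says the pointer is ignored up to v1.15.0 and checked from v1.16.0 onwards, which matches the point at which the "security attack" reports started.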

I see 2 bugs as a result:

  • Hard fault instead of security error inside the CPU2 firmware in v1.20.0

  • The STM32CubeWB's hci_init(...) call does not initialize phci_acl_data_buffer; instead, whatever value was on the stack at the time of invocation ends up in this pointer. Given the change in v1.16.0, this leads to semi-unpredictable behavior in the initialization routine on the CPU2 side, since it validates variables that inherit whatever was on the stack at the time:

    void hci_register_io_bus(tHciIO* fops)
    {
      /* Register IO bus services */
      fops->Init    = TL_BLE_Init;
      fops->Send    = TL_BLE_SendCmd;
    
      return;
    }
    
    void hci_init(void(* UserEvtRx)(void* pData), void* pConf)
    {
      StatusNotCallBackFunction = ((HCI_TL_HciInitConf_t *)pConf)->StatusNotCallBack;
      hciContext.UserEvtRx = UserEvtRx;
    
      hci_register_io_bus (&hciContext.io);
    
      TlInit((TL_CmdPacket_t *)(((HCI_TL_HciInitConf_t *)pConf)->p_cmdbuffer));
    
      return;
    }
    
    typedef struct
    {
      void (* IoBusEvtCallBack) ( TL_EvtPacket_t *phcievt );
      void (* IoBusAclDataTxAck) ( void );
      uint8_t *p_cmdbuffer;
      uint8_t *p_AclDataBuffer;
    } TL_BLE_InitConf_t;
    
    static void TlInit( TL_CmdPacket_t * p_cmdbuffer )
    {
      TL_BLE_InitConf_t Conf;
      ...
    
      /* Initialize low level driver */
      if (hciContext.io.Init)
      {
    
        Conf.p_cmdbuffer = (uint8_t *)p_cmdbuffer;
        /* Several fields of Conf are left uninitialized, including p_AclDataBuffer
           (the value that ends up in phci_acl_data_buffer) */
        Conf.IoBusEvtCallBack = TlEvtReceived;
        hciContext.io.Init(&Conf);
      }
    
      return;
    }
    

    I'd suggest at least initializing these fields to 0. This could simply be done by changing TL_BLE_InitConf_t Conf -> TL_BLE_InitConf_t Conf = {}.

    The wireless application manual states the following about TL_BLE_Init(...), so the current behavior of hci_init(...), which leaves the remaining fields holding whatever was on the stack, does not adhere to the spec:

    When not in HCI only mode, both p_AclDataBuffer and IoBusAclDataTxAck are not used and must be set to 0.
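
A minimal sketch of the zero-initialization suggested above, written so that it compiles on its own. The struct layout is copied from the quoted header. The forward declaration of TL_EvtPacket_t and the make_conf helper are hypothetical stand-ins, not part of STM32CubeWB. I used = {0} here because it is accepted by every C standard, whereas = {} needs C23 or a compiler extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real TL_EvtPacket_t so the sketch compiles alone. */
typedef struct TL_EvtPacket TL_EvtPacket_t;

/* Same layout as the TL_BLE_InitConf_t quoted above. */
typedef struct
{
  void (* IoBusEvtCallBack) ( TL_EvtPacket_t *phcievt );
  void (* IoBusAclDataTxAck) ( void );
  uint8_t *p_cmdbuffer;
  uint8_t *p_AclDataBuffer;
} TL_BLE_InitConf_t;

/* Hypothetical helper: build a config whose unused fields are zeroed. */
static TL_BLE_InitConf_t make_conf(uint8_t *p_cmdbuffer,
                                   void (*evt_cb)(TL_EvtPacket_t *))
{
  TL_BLE_InitConf_t Conf = {0};   /* every member starts out as NULL */
  Conf.p_cmdbuffer = p_cmdbuffer;
  Conf.IoBusEvtCallBack = evt_cb;
  return Conf;
}
```

This only makes the values handed to CPU2 deterministic. With the HCI firmware variants from v1.16.0 onwards, a zeroed p_AclDataBuffer would presumably still fail the validation described earlier, so a real buffer would have to be supplied there.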

So follow-up questions beyond the 2 bugs noted above:

  • Is it expected that the validation changed for phci_acl_data_buffer?
  • Is it expected to be able to utilize the hci_* APIs from ble_hci_le.h to interface with the HCI firmware stack variants, especially the beacon/scan only variant? I don't need passthrough, and I don't need BLE ACL support since I'm only doing BLE advertising packet scans, so I'd have to allocate a completely unused buffer for doing this with v1.16.x and up given what I discovered here.
  • Does the stm32wb5x_BLE_HCI_AdvScan_fw.bin variant even support HCI ACL packets, especially since HCI_LE_CREATE_CONNECTION isn't supported in this stack? Given this, I wouldn't expect this variant of the stack to require phci_acl_data_buffer to be allocated.

tim-nordell-nimbelink avatar Aug 28 '24 21:08 tim-nordell-nimbelink

I got some follow-up answers within the community forum that:

  1. The stm32wb5x_BLE_HCI_AdvScan_fw.bin variant does not support HCI ACL packets, but requires the ACL buffer allocation as the firmware checks "is it a HCI variant?".
  2. The validation change for phci_acl_data_buffer is expected.

tim-nordell-nimbelink avatar Sep 17 '24 16:09 tim-nordell-nimbelink

Hello,

Point 1. Issue has been fixed for STM32CubeWB v1.21.0.

Point 2. Behavior expected and explained to the customer. Check this link please

Regards,

RJMSTM avatar Mar 03 '25 13:03 RJMSTM