NimBLE-Arduino icon indicating copy to clipboard operation
NimBLE-Arduino copied to clipboard

Assistance identifying origin of Backtrace

Open GamingNJncos opened this issue 2 years ago • 3 comments

I'm running into a panic abort issue with BLE-Gamepad (0.5.1 and earlier) and running into a backtrace related to an assertion failure. Hoping I can get some pointers on potential next steps for debugging further. Any pointers are much appreciated. Thanks.

Backtrace

assert failed: static int NimBLECharacteristic::handleGapEvent(uint16_t, uint16_t, ble_gatt_access_ctxt*, void*) NimBLECharacteristic.cpp:277 (rc == 0)


Backtrace:0x40083aa5:0x3ffcff600x4009297d:0x3ffcff80 0x40097f89:0x3ffcffa0 0x401044fc:0x3ffd00d0 0x4010b38d:0x3ffd0150 0x4010b421:0x3ffd0180 0x40107343:0x3ffd01c0 0x401075de:0x3ffd0200 0x4010ac65:0x3ffd0220 0x40104412:0x3ffd0240 0x40104425:0x3ffd02a0 0x400d32a8:0x3ffd02c0 0x400d1e90:0x3ffd0310 0x400d235f:0x3ffd0340

  #0  0x40083aa5:0x3ffcff60 in panic_abort at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/panic.c:402
  #1  0x4009297d:0x3ffcff80 in esp_system_abort at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/esp_system/esp_system.c:128
  #2  0x40097f89:0x3ffcffa0 in __assert_func at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/newlib/assert.c:85
  #3  0x401044fc:0x3ffd00d0 in NimBLECharacteristic::handleGapEvent(unsigned short, unsigned short, ble_gatt_access_ctxt*, void*) at .pio/libdeps/featheresp32/NimBLE-Arduino/src/NimBLECharacteristic.cpp:315 (discriminator 1)
  #4  0x4010b38d:0x3ffd0150 in ble_gatts_val_access at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_gatts.c:378
      (inlined by) ble_gatts_val_access at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_gatts.c:343
  #5  0x4010b421:0x3ffd0180 in ble_gatts_chr_val_access at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_gatts.c:424
  #6  0x40107343:0x3ffd01c0 in ble_att_svr_read at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_att_svr.c:402
  #7  0x401075de:0x3ffd0200 in ble_att_svr_read_handle at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_att_svr.c:477
  #8  0x4010ac65:0x3ffd0220 in ble_gattc_notify_custom at .pio/libdeps/featheresp32/NimBLE-Arduino/src/nimble/nimble/host/src/ble_gattc.c:4174
  #9  0x40104412:0x3ffd0240 in NimBLECharacteristic::notify(unsigned char const*, unsigned int, bool) at .pio/libdeps/featheresp32/NimBLE-Arduino/src/NimBLECharacteristic.cpp:511    
  #10 0x40104425:0x3ffd02a0 in NimBLECharacteristic::notify(bool) at .pio/libdeps/featheresp32/NimBLE-Arduino/src/NimBLECharacteristic.cpp:418
  #11 0x400d32a8:0x3ffd02c0 in BleGamepad::sendReport() at .pio/libdeps/featheresp32/ESP32-BLE-Gamepad/BleGamepad.cpp:774
      (inlined by) BleGamepad::sendReport() at .pio/libdeps/featheresp32/ESP32-BLE-Gamepad/BleGamepad.cpp:672
  #12 0x400d1e90:0x3ffd0310 in pollController() at src/main.cpp:589
  #13 0x400d235f:0x3ffd0340 in Task2code(void*) at src/main.cpp:752 (discriminator 1)

Looking at NimBLECharacteristic.cpp:277 from the panic, it looks like this could be related to packet size but I may be interpreting this incorrectly.

  uuid = ctxt->chr->uuid;
    if(ble_uuid_cmp(uuid, &pCharacteristic->getUUID().getNative()->u) == 0){
        switch(ctxt->op) {
            case BLE_GATT_ACCESS_OP_READ_CHR: {
                rc = ble_gap_conn_find(conn_handle, &desc);
                assert(rc == 0);

                 // If the packet header is only 8 bytes this is a follow up of a long read
                 // so we don't want to call the onRead() callback again.
                if(ctxt->om->om_pkthdr_len > 8 ||
                   pCharacteristic->m_value.size() <= (ble_att_mtu(desc.conn_handle) - 3)) {
                    pCharacteristic->m_pCallbacks->onRead(pCharacteristic);
                    pCharacteristic->m_pCallbacks->onRead(pCharacteristic, &desc);
                }

                ble_npl_hw_enter_critical();
                rc = os_mbuf_append(ctxt->om, pCharacteristic->m_value.data(), pCharacteristic->m_value.size());
                ble_npl_hw_exit_critical(0);
                return rc == 0 ? 0 : BLE_ATT_ERR_INSUFFICIENT_RES;
            }

It's possible I am creating a corruption on accident related to rtos stack size in tasks as I see this getting called. This is the task I'm calling for pinning updates for controller status to Core1 with a priority of 1 if that is helpful. BleGamepad::sendReport() in the backtrace is called from that task which should send the contents of the current packet upstream.

 #11 0x400d32a8:0x3ffd02c0 in BleGamepad::sendReport() at .pio/libdeps/featheresp32/ESP32-BLE-Gamepad/BleGamepad.cpp:774
      (inlined by) BleGamepad::sendReport() at .pio/libdeps/featheresp32/ESP32-BLE-Gamepad/BleGamepad.cpp:672
  #12 0x400d1e90:0x3ffd0310 in pollController() at src/main.cpp:589
  #13 0x400d235f:0x3ffd0340 in Task2code(void*) at src/main.cpp:752 (discriminator 1)

GamingNJncos avatar Sep 24 '22 08:09 GamingNJncos

The only thing I can think of that could cause that assert to trigger is if the connection closed just before processing the read request. Other than that, there could be corruption happening somewhere. This looks like an opportunity to cleanup some of these asserts, this one should be handled gracefully instead.

h2zero avatar Sep 24 '22 13:09 h2zero

Thanks for the reply. I think this is a timing issue.

As I figured I might be overrunning the stack I stuffed a number of uxTaskGetStackHighWaterMark() prints into to try to locate the source. After I added them (3 in total) throughout code, now suddenly the issue is gone. If I remove them, the assert failures immediately return.

Given that the only change is a couple prints, that suggests the added delay is just enough to get past this check in the library.

GamingNJncos avatar Sep 24 '22 18:09 GamingNJncos

i got the same issue

a BLE paket send every ~13ms (as fast as possible without delay between pakets) causing the same issue with a delay of 50ms it's ok.

I replaced: assert(rc == 0); with: if(rc != 0){ break; }

Now it is working without crash

ericlangel avatar Mar 20 '24 14:03 ericlangel