Home icon indicating copy to clipboard operation
Home copied to clipboard

Starting, Stopping and Starting BLE Advertisement cause System Panic

Open alberk8 opened this issue 3 years ago • 5 comments

Library/API/IoT binding

nanoFramework.Device.Bluetooth

Visual Studio version

VS2022 v 17.2.5

.NET nanoFramework extension version

2022.2.0.23

Target name(s)

ESP32_BLE_REV0

Firmware version

1.8.0.362

Device capabilities

No response

Description

System Panics after Starting-> Stopping -> Starting BLE Advertisment.

How to reproduce

  GattServiceProviderResult result = GattServiceProvider.Create(serviceUuid);
            if (result.Error != BluetoothError.Success)
            {
                return;
            }

            GattServiceProvider serviceProvider = result.ServiceProvider;

            // Get created Primary service from provider
            GattLocalService service = serviceProvider.Service;

   
            DataWriter sw = new DataWriter();
            sw.WriteString("This is Bluetooth sample 3");


            GattLocalCharacteristicResult characteristicResult = service.CreateCharacteristic(readStaticCharUuid,
                 new GattLocalCharacteristicParameters()
                 {
                     CharacteristicProperties = GattCharacteristicProperties.Read,
                     UserDescription = "My Static Characteristic",
                     //ReadProtectionLevel = GattProtectionLevel.EncryptionRequired,

                     StaticValue = sw.DetachBuffer()
                 });



            if (characteristicResult.Error != BluetoothError.Success)
            {
                // An error occurred.
                return;
            }



            DeviceInformationServiceService DifService = new DeviceInformationServiceService(
                    serviceProvider,
                    "BLE Sensor",
                    "Model-1",
                    "989898", // no serial number
                    "v1.0",
                    SystemInfo.Version.ToString(),
                    "");

           BatteryService BatService = new BatteryService(serviceProvider);

            // Update the Battery service the current battery level regularly. In this case 94%
            BatService.BatteryLevel = 94;


            Debug.WriteLine("Before Advertise");
            serviceProvider.StartAdvertising(new GattServiceProviderAdvertisingParameters()
            {
                DeviceName = "BLE Sensor",
                IsConnectable = true,
                IsDiscoverable = true,

            });
            for (int i = 1; i < 5; i++)
            {
                Debug.WriteLine("Loop: " + i);
                Thread.Sleep(2_000);
                if (serviceProvider.AdvertisementStatus == GattServiceProviderAdvertisementStatus.Stopped)
                {
                    Debug.WriteLine("Starting");
                    serviceProvider.StartAdvertising();
                    
                }
                else
                {
                    Debug.WriteLine("Stopping");
                    serviceProvider.StopAdvertising();
                }
            }

Expected behaviour

Do not crash the system when stopping and starting BLE Advertisment.

Screenshots

No response

Sample project or code

No response

Aditional information

No response

alberk8 avatar Jul 24 '22 07:07 alberk8

More information. It looks like the vTaskDelete never return within the nimble_port_freertos_deinit() call.

alberk8 avatar Jul 25 '22 01:07 alberk8

More Info, Tracking this down and it looks like the vTaskDelete does not actually delete the memory allocatedby the xTaskCreatePinnedToCore (in fact it is deferred the removal to idle task) and subsequent call to the xTaskCreatePinnedToCore it ran into tlfs memory allocation. This is more like the core of IDF. 😭

0x40098037: insert_free_block at /sources/esp-idf/components/heap/heap_tlsf.c:242
0x40098037: block_insert at /sources/esp-idf/components/heap/heap_tlsf.c:269
0x40098037: block_trim_free at /sources/esp-idf/components/heap/heap_tlsf.c:383
0x40098037: block_prepare_used at /sources/esp-idf/components/heap/heap_tlsf.c:460
0x40098037: tlsf_malloc at /sources/esp-idf/components/heap/heap_tlsf.c:758
0x4009857a: multi_heap_malloc_impl at /sources/esp-idf/components/heap/multi_heap.c:197
0x400846ae: heap_caps_malloc_base at /sources/esp-idf/components/heap/heap_caps.c:147
0x400846c1: heap_caps_malloc at /sources/esp-idf/components/heap/heap_caps.c:167
0x400947b5: xTaskCreatePinnedToCore at /sources/esp-idf/components/freertos/tasks.c:871
0x40127255: nimble_port_freertos_init at /sources/esp-idf/components/bt/host/nimble/nimble/porting/npl/freertos/src/nimble_port_freertos.c:65
0x400fe52a: device_ble_start(ble_services_context&) at /workspaces/nf-interpreter/targets/ESP32/_nanoCLR/nanoFramework.Device.Bluetooth/esp32_nimble.cpp:579
0x400fdce2: Library_sys_dev_ble_native_nanoFramework_Device_Bluetooth_GenericAttributeProfile_GattServiceProvider::NativeStartAdvertising___BOOLEAN(CLR_RT_StackFrame&) at /workspaces/nf-interpreter/targets/ESP32/_nanoCLR/nanoFramework.Device.Bluetooth/sys_dev_ble_native_nanoFramework_Device_Bluetooth_GenericAttributeProfile_GattServiceProvider.cpp:776
0x400e239e: CLR_RT_Thread::Execute_Inner() at /workspaces/nf-interpreter/src/CLR/Core/Interpreter.cpp:840
0x400e20f9: CLR_RT_Thread::Execute() at /workspaces/nf-interpreter/src/CLR/Core/Interpreter.cpp:642
0x400dd08d: CLR_RT_ExecutionEngine::ScheduleThreads(int) at /workspaces/nf-interpreter/src/CLR/Core/Execution.cpp:1205
0x400dc815: CLR_RT_ExecutionEngine::Execute(wchar_t*, int) at /workspaces/nf-interpreter/src/CLR/Core/Execution.cpp:632
0x400f4cbd: ClrStartup at /workspaces/nf-interpreter/src/CLR/Startup/CLRStartup.cpp:405
0x40116e66: CLRStartupThread at /workspaces/nf-interpreter/targets/ESP32/_nanoCLR/CLR_Startup_Thread.c:17
0x401150d9: main_task at /workspaces/nf-interpreter/targets/ESP32/_IDF/esp32/app_main.c:37

alberk8 avatar Aug 01 '22 02:08 alberk8

The problem seems to be a heap corruption. Seen when a new malloc is done. Will need to track done and fix.

Probably overwritting over the end of a memory block.

AdrianSoundy avatar Aug 09 '22 11:08 AdrianSoundy

The initial problem was a typo causing the freeing up of same bit of memory twice. This uncovered another problem when quickly starting and stopping server.

AdrianSoundy avatar Aug 09 '22 12:08 AdrianSoundy

Hi @AdrianSoundy,

Where is that typo? Is it in the FreeContext? I have also raise an issue in ESP-IDF github and one of the recommendation is to put a delay.

Can we make the managed Bluetooth class disposable?. I run out of managed memory when running in a loop (Create -> StartAdvertising -> StopAdvertising);

alberk8 avatar Aug 10 '22 00:08 alberk8

The initial error which caused panic is in FreeContext, wasn't freeing serviceUuid but something else.

        ble_context *context = &srvContext.bleSrvContexts[i];
        if (context->serviceUuid)
        {
            platform_free(context->serviceUuid);  <<<<< 
            context->serviceUuid = NULL;
        }

The next problem was that it wasn't waiting for BLE stack to be started. So when you quickly start and stop it causes problem trying to stop something that hasn't actually started yet.

Both these problems are now fixed in latest code (to be committed)

The next issue is the managed objects are not being released which i am currently working on. At the moment If you only start and stop and don't try to create a new GattServiceProvider then managed memory stays the same. I may have to make everything disposable to make sure memory is freed up. Still trying to investigate why it not being released.

AdrianSoundy avatar Aug 10 '22 21:08 AdrianSoundy

Found why we were getting out of memory issue. A static event listener was holding on to object references. Now will start / stop as per example code and managed memory doesn't decrease after initial decrease.

There is still a panic which occurs after about 25 loops. Doesn't make sense so will leave until after the Center/Client release. At least it's usable now.

AdrianSoundy avatar Aug 11 '22 10:08 AdrianSoundy

Found why we were getting out of memory issue. A static event listener was holding on to object references. Now will start / stop as per example code and managed memory doesn't decrease after initial decrease.

There is still a panic which occurs after about 25 loops. Doesn't make sense so will leave until after the Center/Client release. At least it's usable now.

It is fine to wait for the Client/Central to be released first. I will be aggressively testing both Central and Peripheral scenario.

One other way that I have found that it could work was to call ble_gap_adv_stop(); for the StopAdvertising() (not disposing the whole BLE stack) but I have to check that StartAdvertising only initialize once and subsequently only call esp32_ble_start_advertise(&bleContext); This has survive thousands of Start -> Stop loop.

One suggestion, is to have start and stop advertisement without disposing the whole BLE stack and only on calling Dispose in manage code do we dispose of the BLE stack. If we can also add the following it would be fantastic. When stopping the advertisement, kick out/terminate connected session and prevent direct connection (connecting from central without scanning).

alberk8 avatar Aug 12 '22 00:08 alberk8

@AdrianSoundy @alberk8 , was this fixed with the latest bluetooth changes?

networkfusion avatar Aug 23 '22 17:08 networkfusion

This PR fixed the issue. Closing now.

alberk8 avatar Aug 24 '22 00:08 alberk8