bleak icon indicating copy to clipboard operation
bleak copied to clipboard

Is there a leak in corebluetooth or am I using it wrong?

Open Jc2k opened this issue 3 years ago • 2 comments

  • bleak version: 0.14.2
  • Python version: 3.9
  • Operating System: macOS 10.15.7
  • BlueZ version (bluetoothctl -v) in case of Linux:

Description

I am working on a HomeKit BLE client for Home Assistant to support things like Eve or Nanoleaf "home automation" devices. Polling is working well, and so are what Apple calls "disconnected" events (incrementing a counter in advertising data to trigger a poll). To finish this off i'm looking to add support for "connected" events (start_notify).

For context, HomeKit devices don't do pairing, they implement their own encryption scheme on top of unpaired GATT write/read. HomeKit devices tend to close the connection after less than 30s-ish (haven't timed this). They stay connected in case e.g. you were dragging a brightness slider on your phone and wanted fast successive updates. I don't know if thats normal for BLE or not, but it means that short of polling every 10s I need to reconnect to the device a lot.

I'm at a bit of a loss with I guess what you would call lifetime and ownership issues.

  • Is it expected that a BleakClient is one-shot, that if the BLE device becomes disconnected I need a new one?
  • Or, Is it expected that I can reuse the same BleakClient, and "reconnect" if a BLE device drops the connection?
    • If i reuse it, do i have to re-register the notify callbacks?
  • How do I call stop_notify if the connection was closed by the device?

I've guessed and struggled to find an ownershipe/lifetime model that works for all my test cases. The closest seems to be that BleakClient can only be used once, but there is a bug that means they are leaking right now. So given the frequency of disconnects, if i poll one device once a minute then i'll leak 1440 BleakClient instances a day.

What I Did

This is a reproduction of the things I tried. I don't fully remember them all, I think there was also a situation where I had no errors but no notifications which I can't reproduce.

I believe this simple test case is working, but I don't know if its considered valid (as it reuses a BleakClient).

Note that characteristic 71 is just a random one that happened to have indicate in its properties.

from bleak import BleakClient
import asyncio

async def main():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    async with client:
        char = client.services.characteristics[71]
        await client.start_notify(char.handle, _callback)
        await client.stop_notify(char.handle)

    async with client:
        await client.start_notify(char.handle, _callback)
        await client.stop_notify(char.handle)

asyncio.run(main())

The problem I face is when the device disconnects services are cleared out, so this now fails:

from bleak import BleakClient
import asyncio

async def main():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    async with client:
        char = client.services.characteristics[71]
        await client.start_notify(char.handle, _callback)
        await asyncio.sleep(40)
        await client.stop_notify(char.handle)

    async with client:
        await client.start_notify(char.handle, _callback)
        await client.stop_notify(char.handle)

asyncio.run(main())
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x105e4d6d0>>,) {}
Traceback (most recent call last):
  File "/Users/john/Projects/home-assistant/bleaktest.py", line 24, in <module>
    asyncio.run(main())
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/Users/john/Projects/home-assistant/bleaktest.py", line 18, in main
    await client.stop_notify(char.handle)
  File "/Users/john/Projects/home-assistant/venv/lib/python3.9/site-packages/bleak/backends/corebluetooth/client.py", line 404, in stop_notify
    raise BleakError("Characteristic {} not found!".format(char_specifier))
bleak.exc.BleakError: Characteristic 71 not found!

So I thought maybe I can hold onto the char object and then use it directly. Unfortunately this hangs:

from bleak import BleakClient
import asyncio

async def main():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    async with client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)
        print("START STOP_NOTIFY")
        await client.stop_notify(char)
        print("STOP STOP_NOTIFY")

    async with client:
        await client.start_notify(char, _callback)
        await client.stop_notify(char)

asyncio.run(main())

I waited about a minute after the "START STOP_NOTIFY" before Ctrl+C

DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x1138a3670>>,) {}
START STOP_NOTIFY
^CTraceback (most recent call last):
  File "/Users/john/Projects/home-assistant/bleaktest.py", line 26, in <module>
    asyncio.run(main())
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 629, in run_until_complete
    self.run_forever()
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 596, in run_forever
    self._run_once()
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 1854, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/selectors.py", line 562, in select
    kev_list = self._selector.control(None, max_ev, timeout)
KeyboardInterrupt

I also wondered if a disconnect closed all my notifications for me, so maybe this is ok (its not):

from bleak import BleakClient
import asyncio

async def main():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    async with client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)

    async with client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)

asyncio.run(main())
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d6c5580>>,) {}
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d6c5580>>,) {}
Traceback (most recent call last):
  File "/Users/john/Projects/home-assistant/bleaktest2.py", line 23, in <module>
    asyncio.run(main())
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/Cellar/[email protected]/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/Users/john/Projects/home-assistant/bleaktest2.py", line 21, in main
    await client.start_notify(char, _callback)
  File "/Users/john/Projects/home-assistant/venv/lib/python3.9/site-packages/bleak/backends/corebluetooth/client.py", line 387, in start_notify
    await self._delegate.start_notifications(characteristic.obj, bleak_callback)
  File "/Users/john/Projects/home-assistant/venv/lib/python3.9/site-packages/bleak/backends/corebluetooth/PeripheralDelegate.py", line 206, in start_notifications
    raise ValueError("Characteristic notifications already started")
ValueError: Characteristic notifications already started

At this point I wondered if a BleakClient was "one-shot" and I needed a new one per connection. This at first seemed to work well. Notifications did indeed work. Unfortunately it seemed to reveal a leak in the disconnect handler:

import asyncio
import gc

def get_client():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    return client

async def main():
    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    print("Client 1")
    async with get_client() as client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)

    # Try our best to run any __del__ handlers
    client = None
    gc.collect()

    print("Client 2")
    async with get_client() as client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)

asyncio.run(main())
Client 1
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10be5eac0>>,) {}
Client 2
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10be5eac0>>,) {}
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10be835b0>>,) {}

You can mitigate this by calling client.set_disconnected_callback(None) before getting a new client. But i feared that the leak was still happening. So i added a new log message to bleak/backends/corebluetooth/client.py in disconnect_callback:

        def disconnect_callback():
            logger.info("disconnect_callback called on %s", id(self))

            self.services = BleakGATTServiceCollection()
            # Ensure that `get_services` retrieves services again, rather
            # than using the cached object
            self._services_resolved = False
            self._services = None

            # If there are any pending futures waiting for delegate callbacks, we
            # need to raise an exception since the callback will no longer be
            # called because the device is disconnected.
            for future in self._delegate.futures():
                try:
                    future.set_exception(BleakError("disconnected"))
                except asyncio.InvalidStateError:
                    # the future was already done
                    pass

            if self._disconnected_callback:
                self._disconnected_callback(self)
from bleak import BleakClient
import asyncio
import gc
import logging

logging.basicConfig(level=logging.INFO)

def get_client():
    def _disconnected(*args, **kwargs):
        print("DISCONNECT", args, kwargs)

    client = BleakClient("E8EA177F-0419-44A9-B1DD-D96F348B1184")
    client.set_disconnected_callback(_disconnected)

    return client

async def main():
    def _callback(*args, **kargs):
        print("NOTIFY", args, kwargs)

    print("Client 1")
    async with get_client() as client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)

    client.set_disconnected_callback(None)

    # Try our best to run any __del__ handlers
    client = None
    gc.collect()

    print("Client 2")
    async with get_client() as client:
        char = client.services.characteristics[71]
        await client.start_notify(char, _callback)
        await asyncio.sleep(40)

asyncio.run(main())

Now I get this:

Client 1
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4626001584
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x113bb2eb0>>,) {}
Client 2
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4626209424
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x113be5a90>>,) {}
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4626001584

So while I can detach my own handler, the internal one is still called. So there is still a leak.

Jc2k avatar Feb 11 '22 12:02 Jc2k

Some of the other tickets suggest reusing a BleakClient is currently problematic, and i'd be better off with a fresh one. Which would just leave the leak.

This change at least hides the leak from my logging:

    async def disconnect(self) -> bool:
        """Disconnect from the peripheral device"""
        if (
            self._peripheral is None
 --           or self._peripheral.state() != CBPeripheralStateConnected
        ):
            return True

        await self._central_manager_delegate.disconnect(self._peripheral)

        return True

I now see

Client 1
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4521184656
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d7bcd90>>,) {}
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4521184656
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d7bcd90>>,) {}

Client 2
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4521184752
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d7bcdf0>>,) {}
INFO:bleak.backends.corebluetooth.client:disconnect_callback called on 4521184752
DISCONNECT (<BleakClientCoreBluetooth, E8EA177F-0419-44A9-B1DD-D96F348B1184, <bleak.backends.corebluetooth.client.BleakClientCoreBluetooth object at 0x10d7bcdf0>>,) {}

Previously even after releasing Client 1 we would still see disconnect events for it. This has stopped that.

However the disconnect handler now reliably gets called twice. This is because centralManager_didDisconnectPeripheral_error_ does indeed get called twice.

The first time is when the device disconnects from us. The handler gets called with an error:

Error Domain=CBErrorDomain Code=6 "The connection has timed out unexpectedly." UserInfo={NSLocalizedDescription=The connection has timed out unexpectedly.}

The second time it gets called is when we call disconnect. The handler now gets called with no error.

I also checked this approach against gc.get_object() and it appears to stop BleakClient instances leaking.

Jc2k avatar Feb 11 '22 14:02 Jc2k

#763 fixes the leak, and my notifications work reliably with that - if i use a new BleakClient every time. This is OK with me, but I don't know if thats expected. So I guess that part of my ticket would still stand.

Jc2k avatar Feb 11 '22 15:02 Jc2k