bleak icon indicating copy to clipboard operation
bleak copied to clipboard

GATT Connection drops after multiple reads to characteristic

Open danbachar opened this issue 4 months ago • 5 comments

Hey all,

I'm trying to measure the theoretical throughput as distance changes of the GATT protocol, and for that I have two identical Raspberry Pi 3b+, running Debian 12 Bookwork on aarch64 with the latest BlueZ version 5.83 which I built from source. To do that I have a bluetoothctl GATT server running on the first pi, gr1, that I set up with the following script:

sudo bluetoothctl
advertise.manufacturer 0xffff 0x12 0x34
advertise.name gr1
advertise.tx-power on
gatt.register-service ffffffff-ffff-ffff-ffff-ffffffffffff
gatt.register-characteristic AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA read,write
gatt.register-application
advertise on
discoverable on
pairable on

(If somebody has an idea how I could automate it without the GATT server dying as the bluetoothctl command executes, please let me know!)

To connect to the server, I have another pi sitting directly on top of the other one: gr2, which uses the following bleak python script to read the attribute value as many times as possible within a minute, after gr1 was discovered and service discovery was completed:

#!/usr/bin/env python3
import asyncio
import logging
import argparse
import time
from bleak import (BleakScanner, BleakClient)
from bleak.backends.device import BLEDevice
from bleak.backends.scanner import AdvertisementData

from socket import gethostname

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class textcolor:
    GREEN = '\033[92m'
    RED = '\033[91m'
    CYAN = '\033[96m'
    YELLOW = '\033[93m'
    GOLD = '\033[33m'
    BOLD = '\033[1m'
    END = '\033[0m'

class ThroughputTestPeer:
    def __init__(self, device_name: str, scan_duration: float = 5.0):
        self.device_name = device_name
        self.loop = asyncio.get_event_loop()
        self.trigger = asyncio.Event()
        self.scan_duration = scan_duration

    def is_trusted_peer(self, name: str) -> bool:
        return name.startswith("gr")

    async def scan_for_peers(self):
        """Scan for trusted GATT servers. A trusted server is a server whose name begins with 'gr'."""

        devices: dict[str, tuple[BLEDevice, AdvertisementData]] = {}
        def detection_callback(device: BLEDevice, advertisement_data: AdvertisementData):
            device_name: str = (advertisement_data.local_name and advertisement_data.local_name) or (device.name and device.name) or ""

            if self.is_trusted_peer(device_name):
                print(f"🤝 Found trusted peer: {device_name} ({device.address}) RSSI: {advertisement_data.rssi}")
                devices[device.address] = (device, advertisement_data)

        scanner = BleakScanner(detection_callback=detection_callback)
        await scanner.start()
        await asyncio.sleep(self.scan_duration)
        await scanner.stop()

        print(f"🤝 Found {len(devices.keys())} trusted GATT servers.")
        if len(devices.keys()) == 0:
            return
        print("Sleeping (open btmon now!!!)")
        time.sleep(5)
        print("Requesting services...")

        SERVICE_UUID = "ffffffff-ffff-ffff-ffff-ffffffffffff"

        for device_address in devices:
            device,_adv = devices[device_address]
            try:
                async with BleakClient(device, pair=True) as client:
                    for service in client.services:
                        if service.uuid == SERVICE_UUID:
                            for c in service.characteristics:
                                if "aaaa" in c.uuid.lower():
                                    count = 0
                                    print(f"{textcolor.CYAN}Reading characteristic for 60 seconds...{textcolor.END}")
                                    start = time.time()
                                    end = start + 60 # seconds
                                    while time.time() <= end:
                                        value = await client.read_gatt_char(c.uuid)
                                        value_parsed = int.from_bytes(value)
                                        print(f"📦 Read value: {value_parsed}")
                                        count += len(value)
                                        time.sleep(0.250)
                                    print(f"{textcolor.GREEN}✅ Done! Read {count} bytes in 60 seconds ({count/60} B/s){textcolor.END}")

            except asyncio.exceptions.TimeoutError:
                print(f'{textcolor.RED}TimeoutError:{textcolor.END} Device at address `{device_address}` timed out.')
            except Exception as error:
                print(f'Exception: An error occurred while connecting to device `{device_address}`:\n\t{error}')

def parse_arguments():
    """Parse command line arguments"""
    parser = argparse.ArgumentParser(description='BLE GATT Test')
    peer_name = gethostname()
    parser.add_argument('-n', '--peer-name', default=peer_name, 
                       help=f'Peer name (default: {peer_name})')
    
    return parser.parse_args()

async def main():
    args = parse_arguments()
    peer = ThroughputTestPeer(
        device_name=args.peer_name
    )
    await peer.scan_for_peers()

if __name__ == "__main__":
    asyncio.run(main())

The thing is, I get a disconnect directly after the second read. I put the sleep there in order to not overwhelm the bluetooth stack, but it doesn't help. With a lower sleep time I can get more reads in before the disconnect, but it still disconnects. Having another terminal window open with a bluetoothctl connected session on the second pi completely solved the problem, so my thinking is that it has something to do with a connection timeout on the bleak stack? I am not sure. Anyway I am happy to attach any btmon logs as needed.

danbachar avatar Sep 03 '25 19:09 danbachar

If Wi-Fi is on on the RPi, try turning it off. It is a known issue that it interferes with Bluetooth.

The btmon logs should tell you if it is BlueZ that is disconnecting (it might do this if the server sent an invalid response) or if the server is disconnecting (because the server died).

dlech avatar Sep 03 '25 21:09 dlech

Hey David, Thank you for your quick answer. I tried turning it off using both nmcli and rfkill block- did not have any influence. I connected the GATT server over ethernet to my router. I attach below the read request/responses from btmon on both rPis. log from gr1:

ATT: Read Request (0x0a) len 2
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
bluetoothd[453]: < ACL Data TX: H.. flags 0x00 dlen 6 #298 [hci0] 346.853636
ATT: Read Response (0x0b) len 1
Value[1]: 0d
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
Value[1]: 0d
> HCI Event: Number of Completed Pack.. (0x13) plen 5 #299 [hci0] 346.946875
Num handles: 1
Handle: 64 Address: B8:27:EB:86:98:51 (Raspberry Pi Foundation)
Count: 2
#296: len 5 (0 Kb/s)
Latency: 194 msec (47-487 msec ~179 msec)
#298: len 6 (0 Kb/s)
Latency: 93 msec (47-487 msec ~136 msec)
> ACL Data RX: Handle 64 flags 0x02 dlen 7 #300 [hci0] 347.921942
ATT: Read Request (0x0a) len 2
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
bluetoothd[453]: < ACL Data TX: H.. flags 0x00 dlen 6 #301 [hci0] 347.925781
ATT: Read Response (0x0b) len 1
Value[1]: 0d
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
Value[1]: 0d
bluetoothd[453]: @ MGMT Command.. (0x003f) plen 1 {0x0001} [hci0] 348.068074
Instance: 1
@ MGMT Event: Advertising Removed (0x0024) plen 1 {0x0002} [hci0] 348.068158
Instance: 1
< HCI Command: LE Set Advertis.. (0x08|0x000a) plen 1 #302 [hci0] 348.068244
Advertising: Disabled (0x00)
> HCI Event: Command Complete (0x0e) plen 4 #303 [hci0] 348.069113
LE Set Advertise Enable (0x08|0x000a) ncmd 1
Status: Success (0x00)
@ MGMT Event: Command Complete (0x0001) plen 4 {0x0001} [hci0] 348.069254
Remove Advertising (0x003f) plen 1
Status: Success (0x00)
Instance: 1
> HCI Event: Number of Completed Pack.. (0x13) plen 5 #304 [hci0] 348.171696
Num handles: 1
Handle: 64 Address: B8:27:EB:86:98:51 (Raspberry Pi Foundation)
Count: 1
#301: len 6 (0 Kb/s)
Latency: 245 msec (47-487 msec ~191 msec)
> HCI Event: Disconnect Complete (0x05) plen 4 #305 [hci0] 348.750960
Status: Success (0x00)
Handle: 64 Address: B8:27:EB:86:98:51 (Raspberry Pi Foundation)
Reason: Remote User Terminated Connection (0x13)
@ MGMT Event: Device Disconnected (0x000c) plen 8 {0x0002} [hci0] 348.751047
LE Address: B8:27:EB:86:98:51 (Raspberry Pi Foundation)
Reason: Connection terminated by remote host (0x03)
@ MGMT Event: Device Disconnected (0x000c) plen 8 {0x0001} [hci0] 348.751047
LE Address: B8:27:EB:86:98:51 (Raspberry Pi Foundation)
Reason: Connection terminated by remote host (0x03)
< HCI Command: LE Set Scan Par.. (0x08|0x000b) plen 7 #306 [hci0] 348.782909
Type: Passive (0x00)
Interval: 60.000 msec (0x0060)
Window: 30.000 msec (0x0030)
Own address type: Public (0x00)
Filter policy: Ignore not in accept list (0x01)
> HCI Event: Command Complete (0x0e) plen 4 #307 [hci0] 348.783695
LE Set Scan Parameters (0x08|0x000b) ncmd 1
Status: Success (0x00)
< HCI Command: LE Set Scan Ena.. (0x08|0x000c) plen 2 #308 [hci0] 348.784079
Scanning: Enabled (0x01)
Filter duplicates: Enabled (0x01)
> HCI Event: Command Complete (0x0e) plen 4 #309 [hci0] 348.784666
LE Set Scan Enable (0x08|0x000c) ncmd 1
Status: Success (0x00)

gr2 logs:

ATT: Read Request (0x0a) len 2
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
> HCI Event: Number of Completed Pac.. (0x13) plen 5 #3187 [hci0] 347.845419
Num handles: 1
Handle: 64 Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
Count: 2
#3184: len 9 (0 Kb/s)
Latency: 145 msec (26-145 msec ~103 msec)
#3186: len 7 (1 Kb/s)
Latency: 48 msec (26-145 msec ~76 msec)
> ACL Data RX: Handle 64 flags 0x02 dlen 6 #3188 [hci0] 347.894130
ATT: Read Response (0x0b) len 1
Value[1]: 0d
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
Value[1]: 0d
bluetoothd[1924]: < ACL Data TX:.. flags 0x00 dlen 7 #3189 [hci0] 348.898379
ATT: Read Request (0x0a) len 2
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
> ACL Data RX: Handle 64 flags 0x02 dlen 6 #3190 [hci0] 348.966632
ATT: Read Response (0x0b) len 1
Value[1]: 0d
Handle: 0x001e Type: Vendor specific (aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa)
Value[1]: 0d
> HCI Event: Number of Completed Pac.. (0x13) plen 5 #3191 [hci0] 349.085345
Num handles: 1
Handle: 64 Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
Count: 1
#3189: len 7 (0 Kb/s)
Latency: 186 msec (26-186 msec ~131 msec)
bluetoothd[1924]: @ MGMT Comman.. (0x0014) plen 7 {0x0001} [hci0] 349.511179
LE Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
< HCI Command: Disconnect (0x01|0x0006) plen 3 #3192 [hci0] 349.511373
Handle: 64 Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
Reason: Remote User Terminated Connection (0x13)
> HCI Event: Command Status (0x0f) plen 4 #3193 [hci0] 349.511822
Disconnect (0x01|0x0006) ncmd 1
Status: Success (0x00)
> HCI Event: Disconnect Complete (0x05) plen 4 #3194 [hci0] 349.552132
Status: Success (0x00)
Handle: 64 Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
Reason: Connection Terminated By Local Host (0x16)
@ MGMT Event: Device Disconnected (0x000c) plen 8 {0x0001} [hci0] 349.552173
LE Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
Reason: Connection terminated by local host (0x02)
@ MGMT Event: Command Complete (0x0001) plen 10 {0x0001} [hci0] 349.601713
Disconnect (0x0014) plen 7
Status: Success (0x00)
LE Address: B8:27:EB:E4:AB:2A (Raspberry Pi Foundation)
bluetoothd[1924]: = No matching connection for device [hci0] 349.602984

danbachar avatar Sep 03 '25 21:09 danbachar

The side running Bleak is initiating the disconnect. So I'm guessing either the Python program is throwing an exception or condition of the while loop is no longer true.

dlech avatar Sep 03 '25 21:09 dlech

The python script indeed does throw an exception that says the connection was terminated. Currently not next to the rPis but as soon as I will be I'll send you the exact details. I was able to do a workaround by having prior to this script execution bluetoothctl manually pairing between the pies.

Otherwise, do you have an idea why the size of the characteristic is limited to 8 bits by chance? for all I know the size of the char should be variable according to docs, no?

danbachar avatar Sep 03 '25 22:09 danbachar

I didn't even know that it was possible to do gatt.register-characteristic in bluetoothctl. I would not expect that to be very flexible. There is a Python library called bless that should allow creating a BLE server using BlueZ that might work better.

What is the full stack trace of the exception?

dlech avatar Sep 03 '25 22:09 dlech