moonboard icon indicating copy to clipboard operation
moonboard copied to clipboard

Only first BT packet received when message is cut because of packet size limit

Open reinarrow opened this issue 3 years ago • 20 comments

I have an issue when sending problems to the Raspberry. I have to click the button at the app many times until it gets displayed to the leds. I started troubleshooting and added a print in the BLE service to see the raw received messages.

image

The problem is that most of the received messages are indeed None, see the screenshot below:

image

Over 20 attemps for 1 problem...

I'm using Raspberry Pi Zero and internal Bluetooth with clean installation and Moonboard app in Android device.

Please let me know if you need more info. Any clues on what could be happening?

Thank you in advance!

reinarrow avatar Feb 27 '21 20:02 reinarrow

Ok, I could dig a little bit more into the issue. I set the BLE service in debugger mode and could see the logs from the UnstuffSequence method process_bytes.

Seems that BLE cuts messages into parts due to packet size limitations and for some reason, most of the times I'm receiving only one of those parts (see below).

image

Why is that happening? Is it a code issue or rather a hardware limitation that the phone is sending the two packets too close in time and one is being lost?

reinarrow avatar Feb 27 '21 22:02 reinarrow

Might have something to do with the configuration of the Gatt server Scan response as mentioned here: https://stackoverflow.com/questions/46847750/raspberry-pi3-ble-as-central-device-receiving-just-one-packet-per-connection-int I'm still a bit unfamiliar with this, so any clue to point me in the right direction is more than welcome!

reinarrow avatar Feb 27 '21 23:02 reinarrow

Ok, new update: Bluetooth is correctly receiving all the packets, I could check it by using the utility btmon. Everytime I send a problem, the full array of holds is received, but I guess some of them are lost in the dbus process and the application only sees some of them. I'm starting to investigate the dbus with dbus-monitor.

I would appreciate any help.

Thank you.

reinarrow avatar Mar 03 '21 19:03 reinarrow

I've tried multithreading for the problems processing to check if it was a performance issue.

`class ProblemWorker(threading.Thread): def init(self, unstuffer, logger, q, moonApp, *args, **kargs): super(ProblemWorker, self).init(*args, **kargs) self.unstuffer = unstuffer self.logger = logger self.q = q self.moonApp = moonApp

def run(self):
    self.logger.info('start thread')
    while True:
        self.logger.info('Thread waiting for new problem')
        ba = self.q.get()
        self.logger.info('Processing problem in thread')
        new_problem_string= self.unstuffer.process_bytes(ba)
        if new_problem_string is not None:
            problem= decode_problem_string(new_problem_string)
            self.moonApp.new_problem(json.dumps(problem))
        self.q.task_done()

class MoonApplication(dbus.service.Object): IFACE = "com.moonboard.method" def init(self, bus, socket,logger): self.path = '/com/moonboard' self.services = [] self.logger=logger self.socket=socket self.unstuffer= UnstuffSequence(self.logger) dbus.service.Object.init(self, bus, self.path) self.add_service(UartService(bus,self.get_path(), 0, self.process_rx)) self.problems_queue = Queue()

    self.workers = [ProblemWorker(unstuffer = self.unstuffer, logger = self.logger, q = self.problems_queue, moonApp = self, name = 'a', daemon = True),
                    ProblemWorker(unstuffer = self.unstuffer, logger = self.logger, q = self.problems_queue, moonApp = self, name = 'b', daemon = True)]

    for worker in self.workers:
        worker.start()

def process_rx(self,ba):
    # Save incoming message into queue
    self.problems_queue.put(ba) 
    self.logger.info("Saving new problem into buffer")

`

Same behavior...

reinarrow avatar Mar 04 '21 11:03 reinarrow

Short analysis of BT packet timing. I sent the same problem several times, one that I know that occupies 2 packets and measured time difference between the first and second packet (by using btmon). Then, the right columns show if each packet was received.

image

Note that I did not receive so many correct problems in a row, I cut some of the failing attempts to have more positive samples in the image.

Main conclusions:

  • No packets are lost in the BLE transmission (all are seen by the Raspberry BLE)
  • Some packets closer in time were correctly processed while other with more separation were failed to be received. The timing is not necessarily the issue...
  • For me, the issue is quite random, which makes me think of problems with performance depending on RAM/disk usage. I don't know if this makes sense actually...

reinarrow avatar Mar 09 '21 22:03 reinarrow

Did you try to play with the connection timings, i.e.:

Approach 1 (maybe not working?):

  • hcitool -i hci0 cmd 0x08 0x0009 0d 0c 09 4d 6f 6f 6e 62 6f 61 72 64 20 41

Approach 2 (maybe working):

  • echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval
  • echo 7 > /sys/kernel/debug/bluetooth/hci0/conn_max_interval

8cH9azbsFifZ avatar Mar 10 '21 11:03 8cH9azbsFifZ

Did you try to play with the connection timings, i.e.:

Approach 1 (maybe not working?):

  • hcitool -i hci0 cmd 0x08 0x0009 0d 0c 09 4d 6f 6f 6e 62 6f 61 72 64 20 41

Approach 2 (maybe working):

  • echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval
  • echo 7 > /sys/kernel/debug/bluetooth/hci0/conn_max_interval

I already tried both without success (well maybe for the first one I did not use those exact params, will give it a try).

Anyway, I would say that the Bluetooth connection interval has nothing to do, as I am correctly receiving all the packets (I can see them with btmon). I suspect that the data loss happens from the BLE to the UART.

At this point, I think that it is a UART buffering problem. I'm not familiar with how the internal implementation is dealing with UART RX buffer when a new data arrives without the previous one having been processed.

reinarrow avatar Mar 10 '21 11:03 reinarrow

Did you try to play with the connection timings, i.e.:

Approach 1 (maybe not working?):

* hcitool -i hci0 cmd _0x08 0x0009_ 0d 0c 09 4d 6f 6f 6e 62 6f 61 72 64 20 41

Approach 2 (maybe working):

* echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval

* echo 7 > /sys/kernel/debug/bluetooth/hci0/conn_max_interval

As he mentioned, i used the "echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval" Method, because i have the same issue as you. With Android my BLE packages are sent wiht 1,5ms to 5ms pause, which cannot be processed quickly enough. Setting conn_min_interval zu 6 means a time interval with 7,5 ms. Now more requests are recognised as valid, but i have still some which do not work. Do you have iOs or Android?

Friesi avatar Mar 10 '21 11:03 Friesi

Did you try to play with the connection timings, i.e.: Approach 1 (maybe not working?):

* hcitool -i hci0 cmd _0x08 0x0009_ 0d 0c 09 4d 6f 6f 6e 62 6f 61 72 64 20 41

Approach 2 (maybe working):

* echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval

* echo 7 > /sys/kernel/debug/bluetooth/hci0/conn_max_interval

As he mentioned, i used the "echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval" Method, because i have the same issue as you. With Android my BLE packages are sent wiht 1,5ms to 5ms pause, which cannot be processed quickly enough. Setting conn_min_interval zu 6 means a time interval with 7,5 ms. Now more requests are recognised as valid, but i have still some which do not work. Do you have iOs or Android?

I'm using Android. I just tried the "echo 6 > /sys/kernel/debug/bluetooth/hci0/conn_min_interval" and nothing changed. Then I noticed that this is overwritten upon connection, so I included it in the code to be done everytime and now: Surprise! It actually improved a little bit.

But as @Friesi said, still many packets lost. I still don't fully understand why with btmon I see all the packets.

reinarrow avatar Mar 10 '21 12:03 reinarrow

As reading this now, i think i have never tried to set it to 1 because i read that the minimum time is 7,5ms (so 6). But maybe it is working? Did you try this?

Friesi avatar Mar 10 '21 12:03 Friesi

It was worth it to give it a try. But unfortunately, I get echo: write error: Invalid argument for any value less than 6.

reinarrow avatar Mar 10 '21 12:03 reinarrow

Problem with bluetooth is more complex. I have two RPI with the same MoonBoard software: RPI 3B+ and Rapi W Zero and two Android phones Samsung S9 and S10e S9 to W Zero - connection OK, all BT packets received S10e to W Zero - connection OK, only first BT packet received S9 to RPI 3B+ - connection OK, only first BT packet received S10e to RPI 3B+ - no connection, app is searching for Moonboard but BT connection icon does not show up

marcelbesoeu avatar Mar 10 '21 18:03 marcelbesoeu

That's really disappointing. Any chances that we can work on the BLE app implementation to try to improve this situation? I'm not very familiar with it yet, but would be glad to try to collaborate. @8cH9azbsFifZ @marcelbesoeu @e-sr

reinarrow avatar Mar 11 '21 09:03 reinarrow

It looks like the Moonboard app itself is quite problematic -> https://www.reddit.com/r/climbharder/comments/eryae9/android_moonboard_app_problems/

I would love to see the moonboard team making the app open source...

reinarrow avatar Mar 11 '21 09:03 reinarrow

All packets are trasmitted from phone to RPI Bluetooth what we can see on BTMON, but only first packet is received on DBUS. Maybe there is another way to monitor BT transmission (catch and parse BTMON console output ???)

marcelbesoeu avatar Mar 11 '21 11:03 marcelbesoeu

All packets are trasmitted from phone to RPI Bluetooth what we can see on BTMON, but only first packet is received on DBUS. Maybe there is another way to monitor BT transmission (catch and parse BTMON console output ???)

I like the idea. Will give it a try this afternoon!

reinarrow avatar Mar 11 '21 11:03 reinarrow

Maybe the BT problem is old "ble/gatt_base" code? I found project: https://punchthrough.com/creating-a-ble-peripheral-with-bluez/ . Final code looks much simplier.

marcelbesoeu avatar Mar 11 '21 19:03 marcelbesoeu

Maybe the BT problem is old "ble/gatt_base" code? I found project: https://punchthrough.com/creating-a-ble-peripheral-with-bluez/ . Final code looks much simplier.

I went through the code in that project and it contains the same elements as the gatt_base folder. The only difference is that they put all the classes in a single file in there.

reinarrow avatar Mar 11 '21 19:03 reinarrow

UPDATE!

I implemented a parser for the btmon output and could make it work, now I receive 100% of packets! It was a bit tricky because the stdout was being buffered and I had to force it to unbuffer by implementing a custom class for streaming the data.

Huge thanks to @marcelbesoeu for the suggestion :D

You can find it in my fork of the moonboard app. If the feedback is good, I will issue a Pull Request to the original repo.

reinarrow avatar Mar 12 '21 10:03 reinarrow

Merged the commit into my fork - first test looks fine with my setup.

8cH9azbsFifZ avatar Mar 13 '21 16:03 8cH9azbsFifZ