micropython-lib icon indicating copy to clipboard operation
micropython-lib copied to clipboard

aioble: Ordered handling of char.written().

Open andrewleech opened this issue 3 years ago • 5 comments

On a ble peripheral system with a "significant" number of background asyncio tasks, reception & handling concurrent characteristic writes (eg. a looping task with await characteristic.written() can occur out of order.

For some services this can severly impact function, where writes to multiple characteristics to perform functions is critical. For example, in the Object Transfer Service it's common to "set a list filter" on one char, then "send the next list item" on a separate char. If these are handled out of order incorrect results are sent.

Essentially, if the vm / asyncio loop is slowed down by a background task (in this case a time.sleep()) while two ble write operations are received by the stack, the asyncio.Event for both characteristics can be set before either aioble characteristic event handler services them.

In this case, the characteristic.written() event of which ever task is checked first (whichever one was originally registered first) will be serviced first, regardless of which one came in first.

The first commit in this PR adds a multitest that demonstrates this behavior, failing as some of the messages arrive out of order.

The second commit addresses this issue by re-working the existing Characteristic(capture=True) functionality to work through a single shared deque which is handled by a background async task. This task pops individual incoming write events from the shared queue and forwards them to the respective characteristic event sequentially. Only once that characteristic.written() handles the event does the manager task continue to any other events in the shared queue.

Characteristics configured with capture=False (default) still behave the same as previously.


This work was funded by Planet Innovation.

andrewleech avatar Sep 16 '22 01:09 andrewleech

Just to be clear, by "out of order characteristic writes" you mean separate (independent!) characteristics appear written in a different order on the server, compared to the order they were written to on the client? Eg client writes to A then B, but server sees B update first then A?

It seems to me like there is no guarantee of the ordering of writes to independent characteristics, precisely because they are independent.

Or is the issue that the stack (nimble) does get the writes in the "correct" order (eg A then B) but aioble swaps them around?

dpgeorge avatar Sep 16 '22 12:09 dpgeorge

Yes client writes to charA then charB, server nimble stack gets them in order, but once they go via asyncio events they can be handled out of order by tasks waiting on their respective events.

The issue only occurs when both / multiple events get set faster than asyncio can "get around the loop" to check/ service all the tasks waiting on events.

In practice we've seen it all work properly most of the time, events generally get serviced in order, except intermittently some might get swapped. The theory was that the loop got "paused" briefly while, eg. a file write was occurring. Then whichever event.wait task got checked first in the loop would be handled first. The multitest included here supports this theory.

The stack itself guarantees the order of the writes, and some standard services rely on the order of these writes - OTS in particular requires it in a number of situations.

andrewleech avatar Sep 17 '22 00:09 andrewleech

While the fix presented in this MR is very specific to this particular user case, the question on event order might be worth thinking about more broadly for asyncio in general? Probably more of a potential issue if/when deeper asyncio hardware support is added, eg multiple tasks async waiting on pin /switch press, needing patterns of usage so they get handled in order etc.

andrewleech avatar Sep 17 '22 00:09 andrewleech

Thanks @andrewleech -- what you're describing makes sense. GATT itself will preserve the event order, and aioble doesn't make any attempt to preserve it across characteristics. I hadn't considered that there would be ordering dependencies across capture=True characteristics.

I will need to spend some time reviewing this, the capture stuff is pretty subtle. I'm glad you were able to keep the same API.

I'm a little bit confused why there needs to be both a global queue and a per-characteristic queue. Will the per-characteristic queue ever have more than one item in it -- it seems to me that the monitor task will always wait for the characteristic to dispatch before being able to append a new item?

Another thing to consider (although doesn't have to be this PR), is that we also do the same capture queue thing for notifications. Should we apply the same fix there?

jimmo avatar Sep 18 '22 11:09 jimmo

Thanks @jimmo yeah I wasn't 100% convinced I should hijack the existing capture arg however thought that'd be safer that maintaing / documenting two separate queueing mechanisms in the library.

I'm a little bit confused why there needs to be both a global queue and a per-characteristic queue

I initially kept this to reduce the size of this patch, don't change what's not broke etc - particularly when I saw other usages of deque(1) in other places. However I've now stripped it back to just an single attribute on the Char class and it does simplify things a fair bit.

Another thing to consider (although doesn't have to be this PR), is that we also do the same capture queue thing for notifications. Should we apply the same fix there?

Ah, yes it would certainly make sense to have the same pattern there, I can easily foresee situations where the order of notifications is important.

andrewleech avatar Sep 21 '22 01:09 andrewleech

Thanks @andrewleech this is a really good simplification and much easier to reason about.

Merged with some minor changes in https://github.com/micropython/micropython-lib/commit/eba897420d96650a372b4b65831c24741d667890

jimmo avatar Sep 27 '22 12:09 jimmo