NimBLE-Arduino icon indicating copy to clipboard operation
NimBLE-Arduino copied to clipboard

Reset scanning state without resetting esp32

Open asukiaaa opened this issue 3 years ago • 16 comments

Thank you for sharing this useful library. I want some advice about scanning.

I'm making a library to use xbox controller with using esp32. https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32

Sometimes scanning cannot find no device. I called deinit after scanning and add false for start scan but they don't change state. After resetting esp32, scanning may find a controller. (sometimes not so I reset again.)

Are there any way to reset scanning state other than the followings?

By the way, failing connection to the advertised device is decreased by calling deinit after failing it.

Thank you.

asukiaaa avatar Jul 18 '22 15:07 asukiaaa

If the scan is not producing results it's probably because it wasn't started, you should consider checking the return value from start() to see if there was an error, also enable debug logging to see what the error is.

The scan state gets reset each time it's stopped/started, there should be no need to reset anything.

h2zero avatar Jul 18 '22 16:07 h2zero

Thank you for the information. I checked logs with setting level as 5.

pattern1: matching name

I print toString info by advertized device.

Then I found that target device address was found but name was blank when the problem occurs.

Result of toString by advertized device should be

Name: Xbox Wireless Controller, Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812, txPower: 20

but becomes like this sometime.

Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812, txPower: 20

At completing discovery, 44:16:22:5e:b2:d4 was found without name so that's why my program cannot detect it as xbox controller in this case. (active scan is disabled?) The results seems be kept for next scanning.

D NimBLEScan: discovery complete; reason=0
Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812, txPower: 20
Name: , Address: 41:a7:d5:0e:27:84, manufacturer data: 4c0010051a189e9c54, txPower: 12
Name: , Address: 68:0f:02:3d:8b:f9, manufacturer data: 4c00100506189d1e8b, txPower: 12
Name: , Address: 5a:3d:3e:dd:f8:cd, manufacturer data: 4c001006001a78c0c485, txPower: 12
Name: , Address: 6c:51:a7:c6:56:e2, manufacturer data: 4c0010052a187aa3bc, txPower: 8
Name: , Address: 62:64:d6:0c:10:ed, manufacturer data: 4c0010051c189161bc, txPower: 12
Name: , Address: 75:be:2e:ae:83:47, manufacturer data: 4c0010050e18123474, txPower: 12
I NimBLEDevice: BLE Host Task Started

I NimBLEDevice: NimBle host synced.
D NimBLEDevice: Setting bonding: 1, mitm: 0, sc: 0
D NimBLEDevice: >> setPower: 7 (type: 11)
D NimBLEDevice: << setPower
D NimBLEScan: >> start: duration=10
D NimBLEScan: << start()

Some times esp32 panicked and restarts at printing scan results.

not connected at 221284
not connected at 221784
D NimBLEScan: discovery complete; reason=0
onResult: Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviGuru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x00000000  PS      : 0x00060830  A0      : 0x80095003  A1      : 0x3ffc5f50  
A2      : 0x3ffbcde8  A3      : 0x00000001  A4      : 0x00000002  A5      : 0x00036448  
A6      : 0x00000000  A7      : 0x00000001  A8      : 0x800e4fc4  A9      : 0x3ffc5f30  
A10     : 0x3ffc28ec  A11     : 0x3ffc28ec  A12     : 0x3ffc3940  A13     : 0x00060823  
A14     : 0x00060820  A15     : 0x00000001  SAR     : 0x00000019  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00000000  LBEG    : 0x4008fd74  LEND    : 0x4008fd8a  LCOUNT  : 0x00000000  


Backtrace:0xfffffffd:0x3ffc5f500x40095000:0x3ffc5f70 




ELF file SHA256: 0000000000000000

Rebooting...

pattern2: matching address.

I checked log with specifying controller by address and not printing toString or getName of the advertised device.

Device was found but not connected sometime (until panic and resetting).

D NimBLEScan: discovery complete; reason=0
D NimBLEScan: >> stop()
D NimBLEScan: << stop()
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> deleteServices
D NimBLEClient: << deleteServices
I NimBLEDevice: BLE Host Task Started
I NimBLEDevice: NimBle host synced.
D NimBLEDevice: Setting bonding: 1, mitm: 0, sc: 0
D NimBLEDevice: >> setPower: 7 (type: 11)
D NimBLEDevice: << setPower
not connected at 953683
I NimBLEDevice: BLE Host Task Started

Sometimes esp32 panicked and restarts at printing scan results.

not connected at 221284
not connected at 221784
D NimBLEScan: discovery complete; reason=0
onResult: Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviGuru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x00000000  PS      : 0x00060830  A0      : 0x80095003  A1      : 0x3ffc5f50  
A2      : 0x3ffbcde8  A3      : 0x00000001  A4      : 0x00000002  A5      : 0x00036448  
A6      : 0x00000000  A7      : 0x00000001  A8      : 0x800e4fc4  A9      : 0x3ffc5f30  
A10     : 0x3ffc28ec  A11     : 0x3ffc28ec  A12     : 0x3ffc3940  A13     : 0x00060823  
A14     : 0x00060820  A15     : 0x00000001  SAR     : 0x00000019  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00000000  LBEG    : 0x4008fd74  LEND    : 0x4008fd8a  LCOUNT  : 0x00000000  


Backtrace:0xfffffffd:0x3ffc5f500x40095000:0x3ffc5f70 




ELF file SHA256: 0000000000000000

Rebooting...

The scan state gets reset each time it's stopped/started, there should be no need to reset anything.

I feel that scan results are hold and not updated by scanning. Is there any way to update advertised devices?

Thank you.

asukiaaa avatar Jul 19 '22 15:07 asukiaaa

At completing discovery, 44:16:22:5e:b2:d4 was found without name so that's why my program cannot detect it as xbox controller in this case. (active scan is disabled?)

The cause of this is due to the library reporting the device when the scan ends regardless of having the complete data or not. The scan response had not yet been received when the scan ended and therefore reported what it had so devices would not go unreported.

The backtrace of those crashes indicates a nullptr access somwhere, could you decode those and post the result?

h2zero avatar Jul 19 '22 23:07 h2zero

Here you are. Both backtraces show same function prvProcessReceivedCommands.

Failed to call toString.

not connected at 10770
D NimBLEScan: discovery complete; reason=0
Name: , Address: 70:ae:cd:be:34:cd, manufacturer data: 4c001005471c63c511, txPower: 12
Name: , AGuru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x00000000  PS      : 0x00060230  A0      : 0x80095003  A1      : 0x3ffc5f50  
A2      : 0x3ffbcde8  A3      : 0x00000001  A4      : 0x00000002  A5      : 0x00002c02  
A6      : 0x00000000  A7      : 0x00000001  A8      : 0x800e8989  A9      : 0x3ffc5f30  
A10     : 0x3ffc28ec  A11     : 0x3ffbefb0  A12     : 0x3ffc3940  A13     : 0x00060223  
A14     : 0x00060220  A15     : 0x00000001  SAR     : 0x00000019  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00000000  LBEG    : 0x4008fd74  LEND    : 0x4008fd8a  LCOUNT  : 0x00000000  


Backtrace:0xfffffffd:0x3ffc5f500x40095000:0x3ffc5f70 

  #0  0xfffffffd:0x3ffc5f50 in ?? ??:0
  #1  0x40095000:0x3ffc5f70 in prvProcessReceivedCommands at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/timers.c:852
      (inlined by) prvTimerTask at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/timers.c:600

Failed to stop scanning.

not connected at 14209
D NimBLEScan: discovery complete; reason=0
D NimBLEScan: >> stop()
E NimBLEScan: Failed to cancel scan; rc=30
Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x00000000  PS      : 0x00060630  A0      : 0x80095003  A1      : 0x3ffc5f50  
A2      : 0x3ffbcd80  A3      : 0x00000001  A4      : 0x00000001  A5      : 0x00003971  
A6      : 0x00000000  A7      : 0x00000001  A8      : 0x800e7f75  A9      : 0x3ffc5f30  
A10     : 0x3ffc28ec  A11     : 0x3ffbefb0  A12     : 0x3ffc3940  A13     : 0x00060623  
A14     : 0x00060620  A15     : 0x00000001  SAR     : 0x00000019  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00000000  LBEG    : 0x4008fd74  LEND    : 0x4008fd8a  LCOUNT  : 0x00000000  


Backtrace:0xfffffffd:0x3ffc5f500x40095000:0x3ffc5f70 

  #0  0xfffffffd:0x3ffc5f50 in ?? ??:0
  #1  0x40095000:0x3ffc5f70 in prvProcessReceivedCommands at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/timers.c:852
      (inlined by) prvTimerTask at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/timers.c:600

My environment

  • PlatformIO 6.1.1
  • platformio-esp32 5.0.0 (esp-idf4.4.1)

asukiaaa avatar Jul 23 '22 14:07 asukiaaa

Something strange is happening here. What version of NimBLE are you using? The error in the timer command function has had a workaround for a while now and should not be happening. This was also fixed upstream yesterday finally https://github.com/espressif/esp-idf/commit/5e1437deb84ff7c3becc836a58e45c089e9e3c32

~~Also this: E NimBLEScan: Failed to cancel scan; rc=30 is quite puzzling as there is no error code 30 (0x1E) as far as I know...~~ Just took a look for that code, it is BLE_HS_EDISABLED, so the host was disabled when the call to stop the scan was made.

h2zero avatar Jul 23 '22 15:07 h2zero

Thank you for checking.

What version of NimBLE are you using?

Current latest version 1.4.0.

This was also fixed upstream yesterday finally https://github.com/espressif/esp-idf/commit/5e1437deb84ff7c3becc836a58e45c089e9e3c32

I expect that future release of esp-idf including the commit solves the problem.

Just took a look for that code, it is BLE_HS_EDISABLED, so the host was disabled when the call to stop the scan was made.

Can I do something around starting scaning or finishing scanning?

asukiaaa avatar Jul 23 '22 17:07 asukiaaa

@asukiaaa I've tried your example code to reproduce the error and have not been successful yet. I removed the calls to reset() and there has been no problem.

One thing I noticed is that in the scan result callback you are stopping the scan before setting the advDevice object. By doing this you could create a state in loop where the scan is stopped and there is no advDevice so it starts it again. I suggest just removing the call to stop scan in the result callback since it will be stopped when connecting to the server.

h2zero avatar Jul 24 '22 15:07 h2zero

Thank you for checking. I remove reset and pScan->stop then it seems not cause panick. However, if it cannot get name of the device, it cannot connect to device even if secifying by address. If I reset esp32, some state seems be resetted and can get name of the device and succeeded in connection sometime.

I want to reset some state without resetting esp32.

asukiaaa avatar Jul 25 '22 14:07 asukiaaa

Try extending your scan time. What could be happening is the scan is stopping before the controller has a chance to respond the the scan response request (which is where the name data comes from). The cause could also be that the controller simply has stopped advertising and is no longer listening for connections/scan requests. If there is any other data in the advertisement you could use for detection I would suggest trying to use that instead of the name since it would be in the initial advertisement packet so you would not need the scan response to detect it.

h2zero avatar Jul 27 '22 00:07 h2zero

Thank you for the advice.

Try extending your scan time.

I test with setting scan time as 10 seconds. (However device found log shown in few seconds so 4 seconds seem enough.)

the controller simply has stopped advertising and is no longer listening for connections/scan requests.

I don't think so because led of the controller is blinking and often succeeded in connection after resetting esp32.

If there is any other data in the advertisement you could use for detection I would suggest trying to use that

Which is effect information in this case to find any xbox controller? manufacturer and serviceUUID?

Name: Xbox Wireless Controller, Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812, txPower: 20

By watching logs, I found that some updating is proceeded in scanning.

I NimBLEScan: New advertiser: 44:16:22:5e:b2:d4
I NimBLEScan: Updated advertiser: 44:16:22:5e:b2:d4
not connected at 357771

However, the updating seems not enough and it failed connection often. (In this case, more than 300 seconds (5 minutes).)

not connected at 367271
D NimBLEScan: discovery complete; reason=0
Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812, txPower: 20
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event

I call pScan->setDuplicateFilter(false) to handle callback with using updated fresh data but it is not called often.

To know reason of not calling callback, I put log to NimBLE and check behavior. https://github.com/asukiaaa/NimBLE-Arduino/commit/fdf048cab8f82286aa8cd58f93ffc337c4be5c83 https://github.com/asukiaaa/NimBLE-Arduino/blob/check-finding/src/NimBLEScan.cpp#L141-L164

Then I found that event type often become 0 after than first finding then callback during scanning is not called. Is it possible to handle event_type 4 more often?

Before calling callback

I NimBLEScan: New advertiser: 44:16:22:5e:b2:d4
I NimBLEScan: advType: 0
I NimBLEScan: event_type: 0
I NimBLEScan: end if
I NimBLEScan: Updated advertiser: 44:16:22:5e:b2:d4
I NimBLEScan: advType: 0
I NimBLEScan: event_type: 4
I NimBLEScan: call on result by else if

Found but finish without calling callback

I NimBLEScan: New advertiser: 44:16:22:5e:b2:d4
I NimBLEScan: advType: 0
I NimBLEScan: event_type: 0
I NimBLEScan: end if

asukiaaa avatar Jul 27 '22 14:07 asukiaaa

I would attempt to leverage these parameters appearance: 964, manufacturer data: 060000, serviceUUID: 0x1812 for detection. These should be in the main advertisement packet, allowing you to disable active scanning which will be more successful.

For explanation, when the device advertises it sends packets very frequently. The scan response is actually a pseudo connection event which requires the advertiser to listen for a scan response request, which it does after each advertisement for only a few milliseconds. This leads to the request possibly not being received or the scanner missing the response if it's window/interval are not matched or currently on the wrong channel. The scan response is usually what contains the name of the device unless there is enough room in the advertisement packet for it, in many cases there is not. So using the name as an identification method has a high probability of failure as you are seeing.

This is precisely what you are seeing when you noticed this:

Then I found that event type often become 0 after than first finding then callback during scanning is not called. Is it possible to handle event_type 4 more often?

event_type 0 is the advertisement, event_type 4 is the scan response. The scan response is not reliable data because it can be missed as described above.

What I would do now is disable active scanning and analyze the data you receive from the controller without that data, then try to use that to identify the device in the callback instead of the name/scan response data.

h2zero avatar Jul 28 '22 01:07 h2zero

Thank you for the advice. I updated code to scan not active and match without name. https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32/commit/dff11dedb0779a6e60f71fc7b61bf6da0b7bbf2b

However, failing connection rate seems be same as active scan. It cannot connect over 300 seconds for 1 or 2 times on 10 times trial. After waiting 300 seconds, I reset esp32 then it almost succeeded in connection with the controller.

This is log of failing connection over 300 seconds. The log shows updating but it seems not enough to connect.

D NimBLEScan: >> start: duration=5
D NimBLEScan: << start()
not connected at 311783
I NimBLEScan: New advertiser: 44:16:22:5e:b2:d4
I NimBLEScan: advType: 0
I NimBLEScan: event_type: 0
I NimBLEScan: call on result by if
Name: , Address: 44:16:22:5e:b2:d4, appearance: 964, manufacturer data: 0600030080
I NimBLEScan: end if
I NimBLEScan: New advertiser: 1d:32:a1:a8:6a:54
I NimBLEScan: advType: 3
I NimBLEScan: event_type: 3
I NimBLEScan: call on result by if
Name: , Address: 1d:32:a1:a8:6a:54, manufacturer data: 060001092002eb9934ff7135ca9b8d34e370073f5cc5bee50314e361a9
I NimBLEScan: end if
I NimBLEScan: Updated advertiser: 44:16:22:5e:b2:d4
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEScan: >> stop()
D NimBLEScan: << stop()
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
D NimBLEClient: >> connect(44:16:22:5e:b2:d4)
D NimBLEClient: Got Client event 
I NimBLEClient: Connected event
lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2)
D NimBLEClient: Got Client event 
E NimBLEClient: Connection failed; status=574 
not connected at 313969
D NimBLEScan: >> start: duration=5
D NimBLEScan: << start()
not connected at 314475
I NimBLEScan: New advertiser: 44:16:22:5e:b2:d4

I tried 10 seconds for scan time but failing rate seems similar as 4 seconds so I keep 4 seconds.

Do you have some idea to avoid failing connection for more than 300 seconds? https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32/blob/dff11dedb0779a6e60f71fc7b61bf6da0b7bbf2b/src/XboxSeriesXControllerESP32_asukiaaa.hpp

asukiaaa avatar Jul 28 '22 17:07 asukiaaa

This is a pretty common error and is usually due to the device not responding within the connection interval set.

The message lld_pdu_get_tx_flush_nb HCI packet count mismatch (1, 2) is from the controller and is saying that the packets are out of sync.

I recently came across this explanation on the Nordic forums: https://devzone.nordicsemi.com/f/nordic-q-a/40808/connect-procedure-and-the-disconnect-reason-0x3e-ble_hci_conn_failed_to_be_established

Basically, what you can do is try setting the connection parameters differently and see if you get a more reliable connection.

h2zero avatar Jul 29 '22 00:07 h2zero

Thank you for the comment.

I tried changing connection params but it does not solve blocking connection. I start resetting esp32 when failing connection 3 times then this approach succeeded in connection with controller less than 60 seconds in almost times.

I expect that I can reset synchronization with the controller without resetting esp32 in future.

asukiaaa avatar Aug 01 '22 13:08 asukiaaa

Try removing this function: https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32/blob/d4706184842e821deaa7708742f8ea31a60025c5/src/XboxSeriesXControllerESP32_asukiaaa.hpp#L63

That will let the controller dictate the connection parameters.

Also I would remove this: https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32/blob/d4706184842e821deaa7708742f8ea31a60025c5/src/XboxSeriesXControllerESP32_asukiaaa.hpp#L276

The default of 30 seconds is enough

To improve scan results also, you can make these 2 lines the same value: https://github.com/asukiaaa/arduino-XboxSeriesXControllerESP32/blob/d4706184842e821deaa7708742f8ea31a60025c5/src/XboxSeriesXControllerESP32_asukiaaa.hpp#L206

h2zero avatar Aug 01 '22 14:08 h2zero

Thank you for the advice. I updated code but 2 times in 10 times trial cannot connect over than 300 seconds. The approach seems not enough.

asukiaaa avatar Aug 01 '22 15:08 asukiaaa