SparkFun_RTK_Firmware icon indicating copy to clipboard operation
SparkFun_RTK_Firmware copied to clipboard

RTK Express drops from casting to xmitting and does not automatically reconnect

Open blubbi321 opened this issue 1 year ago • 4 comments

Subject of the issue

Im seeing issues similar to https://github.com/sparkfun/SparkFun_RTK_Firmware/issues/257 :

Setup: RTK Express (as a base) <-> Wifi <-> Mobile Phone Hotspot <-> RTK2Go

Symptom is that the Express randomly drops from casting into xmitting and never recovers back into casting on its own. When manually restarted, it will go into casting again, after surveying in, which eats a lot of time and is not acceptable for our usecase.

Since the setting is very remote, the mobile connection can have quite some delays. This is what I currently blame the problems on.

However, I could reproduce the issue by just shutting down the mobile phone - therefore I would assume the problems at least partly also stem from the RTK Express.

Your workbench

  • What version of RTK firmware are you running? https://github.com/sparkfun/SparkFun_RTK_Firmware_Binaries/blob/main/RTK_Surveyor_Firmware_RC-Jan_19_2024.bin
  • What radios are you using: WiFi; Are you transmitting NTRIP back to the device? No

Steps to reproduce

See above.

I will provide the serial logs ASAP

Expected behavior

I would expect the Express to do incremental reconnects to RTK2Go to avoid getting blocked, but at the same time keep the rtcm stream flowing as much as possible.

Actual behavior

Device drops into xmitting and does not recover

blubbi321 avatar May 09 '24 15:05 blubbi321

We've deployed various firmware fixes for just this use case. I've got a base unit that's been deployed for months that keeps ticking along: USA_CO_Alma1. The satellite internet comes and goes but the base recovers.

I will try to replicate your issue.

nseidle avatar May 10 '24 22:05 nseidle

~~What version of RTK firmware are you running? This can be found when the serial menu is opened (also in the settings.txt file, and in the serial output at power up).~~ My fault. It's there in your post.

nseidle avatar May 20 '24 15:05 nseidle

image

Above, please turn on NTRIP Server debugging before you capture logs: s for system->n for debug network->23 and 24 to enable NTRIP Server messages

    M parser error maxLength: 158 bytes
[  5508][I][BluetoothSerial.cpp:361] esp_spp_cb(): ESP_SPP_START_EVT
[  5985][D][Begin.ino:385] beginSD(): Initializing microSD - using SPI, SdFat and SdFile
[  6063][D][NVM.ino:172] recordSystemSettingsToFileLFS(): Removing LittleFS: /SFE_Express_Settings_0.txt
[  6248][D][NVM.ino:184] recordSystemSettingsToFileLFS(): Settings recorded to LittleFS: /SFE_Express_Settings_0.txt
STATE_BASE_NOT_STARTED --> STATE_BASE_FIXED_NOT_STARTED, 2024-05-20 16:05:13.863
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
STATE_BASE_FIXED_NOT_STARTED --> STATE_BASE_FIXED_TRANSMITTING, 2024-05-20 16:05:14.194
FreeHeap: 80552 / HeapLowestPoint: 68724 / LargestBlock: 69620
NTRIP Server start
NTRIP_SERVER_OFF --> NTRIP_SERVER_ON
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
[  7138][D][WiFi.ino:406] wifiStart(): Starting WiFi
[  7201][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 0 - WIFI_READY
[  7204][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 2 - STA_START
[  7205][I][WiFiMulti.cpp:84] addAP(): [WIFI][APlistAdd] add SSID: Roving
Connecting WiFi... [  8791][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 1 - SCAN_DONE
[  8792][I][WiFiMulti.cpp:114] run(): [WIFI] scan done
[  8792][I][WiFiMulti.cpp:119] run(): [WIFI] 26 networks found
[  8797][D][WiFiMulti.cpp:149] run():  --->   0: [1][62:2E:92:25:75:77] Roving (-34) *
[  8805][D][WiFiMulti.cpp:151] run():        1: [11][82:AC:B9:E7:7E:4C] sparkfun-funions (-52) *
[  8814][D][WiFiMulti.cpp:151] run():        2: [11][74:AC:B9:E7:7E:4C] sparkfun-internal (-53) *
[  8822][D][WiFiMulti.cpp:151] run():        3: [11][7E:AC:B9:E7:7E:4C] sparkfun-iot (-54) *
[  8830][D][WiFiMulti.cpp:151] run():        4: [11][7A:AC:B9:E7:7E:4C] sparkfun-guest (-56)
[  8839][D][WiFiMulti.cpp:151] run():        5: [11][7E:A7:B0:ED:EB:4A] DIRECT-0uLexmark MC3326adw-4aeb (-74) *
[  8848][D][WiFiMulti.cpp:151] run():        6: [6][FA:8F:CA:69:68:D2] Den TV (-75)
[  8856][D][WiFiMulti.cpp:151] run():        7: [6][E6:63:DA:28:0A:84] sparkfun-guest (-76)
[  8864][D][WiFiMulti.cpp:151] run():        8: [6][E0:63:DA:28:0A:84] sparkfun-internal (-77) *
[  8873][D][WiFiMulti.cpp:151] run():        9: [6][EA:63:DA:28:0A:84] sparkfun-iot (-77) *
[  8881][D][WiFiMulti.cpp:151] run():        10: [6][EE:63:DA:25:70:7C] sparkfun-funions (-78) *
[  8889][D][WiFiMulti.cpp:151] run():        11: [6][EA:63:DA:25:70:7C] sparkfun-iot (-78) *
[  8897][D][WiFiMulti.cpp:151] run():        12: [6][E0:63:DA:25:70:7C] sparkfun-internal (-79) *
[  8906][D][WiFiMulti.cpp:151] run():        13: [11][60:D0:2C:93:D6:88] NI_IOT (-83) *
[  8914][D][WiFiMulti.cpp:151] run():        14: [1][60:D0:2C:53:DE:98] NI-Guest (-84) *
[  8922][D][WiFiMulti.cpp:151] run():        15: [1][60:D0:2C:93:DE:98] NI_IOT (-84) *
[  8929][D][WiFiMulti.cpp:151] run():        16: [11][60:D0:2C:53:D6:88] NI-Guest (-84) *
[  8937][D][WiFiMulti.cpp:151] run():        17: [11][60:D0:2C:13:D6:88] NI-Wireless (-86) *
[  8945][D][WiFiMulti.cpp:151] run():        18: [11][7E:AC:B9:E1:B0:37] sparkfun-iot (-87) *
[  8953][D][WiFiMulti.cpp:151] run():        19: [11][7A:AC:B9:E1:B0:37] sparkfun-guest (-88)
[  8962][D][WiFiMulti.cpp:151] run():        20: [11][82:AC:B9:E1:B0:37] sparkfun-funions (-89) *
[  8971][D][WiFiMulti.cpp:151] run():        21: [1][AC:8B:A9:DA:31:69] sparkfun-internal (-92) *
[  8979][D][WiFiMulti.cpp:151] run():        22: [6][60:D0:2C:17:60:38] NI-Wireless (-92) *
[  8987][D][WiFiMulti.cpp:151] run():        23: [6][60:D0:2C:97:60:38] NI_IOT (-92) *
[  8995][D][WiFiMulti.cpp:151] run():        24: [1][B2:8B:A9:DA:31:69] sparkfun-guest (-93)
[  9003][D][WiFiMulti.cpp:151] run():        25: [6][60:D0:2C:57:60:38] NI-Guest (-93) *
[  9011][I][WiFiMulti.cpp:160] run(): [WIFI] Connecting BSSID: 62:2E:92:25:75:77 SSID: Roving Channel: 1 (-34)
[  9096][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[  9128][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  9129][D][WiFiGeneric.cpp:880] _eventCallback(): STA IP: 192.168.240.24, MASK: 255.255.255.0, GW: 192.168.240.56
[  9135][I][WiFiMulti.cpp:174] run(): [WIFI] Connecting done.
[  9140][D][WiFiMulti.cpp:175] run(): [WIFI] SSID: Roving
[  9145][D][WiFiMulti.cpp:176] run(): [WIFI] IP: 192.168.240.24
[  9151][D][WiFiMulti.cpp:177] run(): [WIFI] MAC: 62:2E:92:25:75:77
[  9157][D][WiFiMulti.cpp:178] run(): [WIFI] Channel: 1

WiFi IP address: 192.168.240.24, RSSI: -33
FreeHeap: 42368 / HeapLowestPoint: 32936 / LargestBlock: 32244
NTRIP_SERVER_NETWORK_STARTED --> NTRIP_SERVER_NETWORK_CONNECTED
NTRIP_SERVER_NETWORK_CONNECTED --> NTRIP_SERVER_WAIT_GNSS_DATA
NTRIP_SERVER_WAIT_GNSS_DATA --> NTRIP_SERVER_CONNECTING
NTRIP Server connecting to rtk2go.com:2101
NTRIP Server sending authorization credentials
NTRIP_SERVER_CONNECTING --> NTRIP_SERVER_AUTHORIZATION
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Server Response: ICY 200 OK

NTRIP Server connected to rtk2go.com:2101 bldr_dwntwn2
NTRIP_SERVER_AUTHORIZATION --> NTRIP_SERVER_CASTING
[ 19763][D][WiFiClient.cpp:528] connected(): Disconnected: RES: 0, ERR: 128
Connection to NTRIP Caster was lost
FreeHeap: 43224 / HeapLowestPoint: 30464 / LargestBlock: 33780
NTRIP_SERVER_CASTING --> NTRIP_SERVER_ON
NTRIP Server Connecting - rtk2go.com/bldr_dwntwn2:2101 Last Uptime: 0 00:00:00.000 (Reconnects: 1)
NTRIP Server trying again in 15 seconds.
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
FreeHeap: 43336 / HeapLowestPoint: 30464 / LargestBlock: 33780
NTRIP_SERVER_NETWORK_STARTED --> NTRIP_SERVER_NETWORK_CONNECTED
NTRIP_SERVER_NETWORK_CONNECTED --> NTRIP_SERVER_WAIT_GNSS_DATA
NTRIP_SERVER_WAIT_GNSS_DATA --> NTRIP_SERVER_CONNECTING
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
RTCM parser error maxLength: 173 bytes
NTRIP Server connecting to rtk2go.com:2101
NTRIP Server sending authorization credentials
NTRIP_SERVER_CONNECTING --> NTRIP_SERVER_AUTHORIZATION
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
FreeHeap: 43224 / HeapLowestPoint: 30464 / LargestBlock: 33780
NTRIP_SERVER_AUTHORIZATION --> NTRIP_SERVER_ON
NTRIP Server Connecting - rtk2go.com/bldr_dwntwn2:2101 Last Uptime: 0 00:00:00.000 (Reconnects: 2)
NTRIP Server trying again in 30 seconds.
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
FreeHeap: 43696 / HeapLowestPoint: 30464 / LargestBlock: 33780
NTRIP_SERVER_NETWORK_STARTED --> NTRIP_SERVER_NETWORK_CONNECTED
NTRIP_SERVER_NETWORK_CONNECTED --> NTRIP_SERVER_WAIT_GNSS_DATA
NTRIP_SERVER_WAIT_GNSS_DATA --> NTRIP_SERVER_CONNECTING
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
NTRIP Server connecting to rtk2go.com:2101
NTRIP Server sending authorization credentials
NTRIP_SERVER_CONNECTING --> NTRIP_SERVER_AUTHORIZATION
Server Response: ICY 200 OK

NTRIP Server connected to rtk2go.com:2101 bldr_dwntwn2
NTRIP_SERVER_AUTHORIZATION --> NTRIP_SERVER_CASTING
2024-05-20 16:06:23.661
    Tx RTCM: 2024-05-20 16:06:23.661, 25 bytes sent
NTRIP Server transmitted 25 RTCM bytes to Caster
    Tx RTCM: 2024-05-20 16:06:23.754, 128 bytes sent
    Tx RTCM: 2024-05-20 16:06:23.846, 128 bytes sent
    Tx RTCM: 2024-05-20 16:06:23.940, 128 bytes sent
    Tx RTCM: 2024-05-20 16:06:24.032, 128 bytes sent
    Tx RTCM: 2024-05-20 16:06:24.063, 34 bytes sent

In the test above, the RTK_Surveyor_Firmware_RC-Jan_19_2024.bin is setup as a fixed base (this is to speed up testing). You can see it connects to WiFi, fails, then re-tries and gets connected to the caster and starts sending packets.

   Tx RTCM: 2024-05-20 16:10:38.553, 25 bytes sent
NTRIP Server transmitted 571 RTCM bytes to Caster
2024-05-20 16:10:38.648
    Tx RTCM: 2024-05-20 16:10:38.648, 128 bytes sent
    Tx RTCM: 2024-05-20 16:10:38.741, 128 bytes sent
    Tx RTCM: 2024-05-20 16:10:38.790, 60 bytes sent
    Tx RTCM: 2024-05-20 16:10:38.845, 68 bytes sent
    Tx RTCM: 2024-05-20 16:10:38.938, 128 bytes sent
    Tx RTCM: 2024-05-20 16:10:38.967, 34 bytes sent
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
[158659][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 5 - STA_DISCONNECTED
[158660][W][WiFiGeneric.cpp:852] _eventCallback(): Reason: 1 - UNSPECIFIED
WiFi link lost
[158669][D][WiFiClient.cpp:528] connected(): Disconnected: RES: -1, ERR: 113
FreeHeap: 44260 / HeapLowestPoint: 28828 / LargestBlock: 33780
NTRIP_SERVER_CASTING --> NTRIP_SERVER_ON
NTRIP Server Connecting - rtk2go.com/bldr_dwntwn2:2101 Last Uptime: 0 00:00:29.882 (Reconnects: 3)
NTRIP Server trying again in 60 seconds.
[158703][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 3 - STA_STOP
[158704][D][WiFi.ino:457] wifiShutdown(): WiFi Stopped
Network delaying 7 seconds before connection
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
[166250][D][WiFi.ino:406] wifiStart(): Starting WiFi
[166321][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 0 - WIFI_READY
[166323][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 2 - STA_START
[166324][I][WiFiMulti.cpp:84] addAP(): [WIFI][APlistAdd] add SSID: Roving
Connecting WiFi... [167911][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 1 - SCAN_DONE
[167911][I][WiFiMulti.cpp:114] run(): [WIFI] scan done
[167911][I][WiFiMulti.cpp:119] run(): [WIFI] 29 networks found
[167917][D][WiFiMulti.cpp:151] run():        0: [11][7A:AC:B9:E7:7E:4C] sparkfun-guest (-51)
[167925][D][WiFiMulti.cpp:151] run():        1: [11][74:AC:B9:E7:7E:4C] sparkfun-internal (-52) *

In the test above, the hotspot is turned off. RTK device attempts to reconnect and fails. NTRIP Server trying again in 60 seconds.

WiFi IP address: 192.168.240.24, RSSI: -30
FreeHeap: 43300 / HeapLowestPoint: 28828 / LargestBlock: 33780
NTRIP_SERVER_NETWORK_STARTED --> NTRIP_SERVER_NETWORK_CONNECTED
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
NTRIP_SERVER_NETWORK_CONNECTED --> NTRIP_SERVER_WAIT_GNSS_DATA
NTRIP_SERVER_WAIT_GNSS_DATA --> NTRIP_SERVER_CONNECTING
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
NTRIP Server connecting to rtk2go.com:2101
NTRIP Server sending authorization credentials
NTRIP_SERVER_CONNECTING --> NTRIP_SERVER_AUTHORIZATION
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
FreeHeap: 42644 / HeapLowestPoint: 28828 / LargestBlock: 32756
NTRIP_SERVER_AUTHORIZATION --> NTRIP_SERVER_ON
NTRIP Server Connecting - rtk2go.com/bldr_dwntwn2:2101 Last Uptime: 0 00:00:29.882 (Reconnects: 4)
NTRIP Server trying again in 2 minutes.
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
FreeHeap: 43116 / HeapLowestPoint: 28828 / LargestBlock: 33780
NTRIP_SERVER_NETWORK_STARTED --> NTRIP_SERVER_NETWORK_CONNECTED
NTRIP_SERVER_NETWORK_CONNECTED --> NTRIP_SERVER_WAIT_GNSS_DATA
NTRIP_SERVER_WAIT_GNSS_DATA --> NTRIP_SERVER_CONNECTING

60 seconds later, device fails and tries again in NTRIP Server trying again in 2 minutes.

Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green NTRIP Server connecting to rtk2go.com:2101 NTRIP Server sending authorization credentials NTRIP_SERVER_CONNECTING --> NTRIP_SERVER_AUTHORIZATION Server Response: ICY 200 OK

NTRIP Server connected to rtk2go.com:2101 bldr_dwntwn2
NTRIP_SERVER_AUTHORIZATION --> NTRIP_SERVER_CASTING
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
2024-05-20 16:13:50.474
    Tx RTCM: 2024-05-20 16:13:50.474, 25 bytes sent
NTRIP Server transmitted 571 RTCM bytes to Caster
    Tx RTCM: 2024-05-20 16:13:50.591, 128 bytes sent
    Tx RTCM: 2024-05-20 16:13:50.683, 128 bytes sent
    Tx RTCM: 2024-05-20 16:13:50.777, 128 bytes sent
    Tx RTCM: 2024-05-20 16:13:50.960, 162 bytes sent
    Tx RTCM: 2024-05-20 16:13:51.461, 25 bytes sent
NTRIP Server transmitted 571 RTCM bytes to Caster
2024-05-20 16:13:51.564
    Tx RTCM: 2024-05-20 16:13:51.564, 128 bytes sent

Above, 2 minutes later it connects.

[387171][D][WiFiClient.cpp:528] connected(): Disconnected: RES: -1, ERR: 113
    Tx RTCM: 2024-05-20 16:14:27.574, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.584, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.597, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.610, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.624, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.638, 0 bytes sent
    Tx RTCM: 2024-05-20 16:14:27.646, 0 bytes sent
WiFi link lost
FreeHeap: 44028 / HeapLowestPoint: 27416 / LargestBlock: 32756
NTRIP_SERVER_CASTING --> NTRIP_SERVER_ON
NTRIP Server Connecting - rtk2go.com/bldr_dwntwn2:2101 Last Uptime: 0 00:00:37.995 (Reconnects: 5)
NTRIP Server trying again in 5 minutes.
[387268][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 3 - STA_STOP
[387271][D][WiFi.ino:457] wifiShutdown(): WiFi Stopped
Network delaying 7 seconds before connection
NTRIP_SERVER_ON --> NTRIP_SERVER_NETWORK_STARTED
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
Batt (82%): Voltage: 4.05V Charging: 0.00%/hr Green
[394789][D][WiFi.ino:406] wifiStart(): Starting WiFi
[394860][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 0 - WIFI_READY
[394862][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 2 - STA_START
[394863][I][WiFiMulti.cpp:84] addAP(): [WIFI][APlistAdd] add SSID: Roving
Connecting WiFi... [396450][D][WiFiGeneric.cpp:831] _eventCallback(): Arduino Event: 1 - SCAN_DONE
[396451][I][WiFiMulti.cpp:114] run(): [WIFI] scan done
[396451][I][WiFiMulti.cpp:119] run(): [WIFI] 20 networks found
[396456][D][WiFiMulti.cpp:151] run():        0: [11][74:AC:B9:E7:7E:4C] sparkfun-internal (-50) *

Above, when we turn off the WiFi a second time we see the NTRIP Server trying again in 5 minutes..

This is all expected behavior but the NTRIP connection throttling is a bit aggressive with its timeout increases. The NTRIP Server's timeout could be reset once a successful WiFi connection is made.

nseidle avatar May 20 '24 16:05 nseidle

Throttling is reset to 15 seconds between reconnect attempts after 5 minutes of a successful connection. We could expose the NTRIP_SERVER_CONNECTION_TIME setting to the user for changing. Would that help? You could then reduce this to say, 15 seconds of successful connection time would reset the throttling to 15 seconds.

Other options: You could record the base stations location and change it to a fixed location base setup, removing the 60 second survey-in time.

I think the crux of the issue is: why is the RTK device disconnect from the Caster? Is it that WiFi is being lost? Or is the caster hanging up on the RTK device for some reason? This will be hard to troubleshoot further without logs.

nseidle avatar May 20 '24 16:05 nseidle

Closing... Please reopen if you need more help with this issue.

Best wishes, Paul

PaulZC avatar Jun 13 '24 10:06 PaulZC