Startup fails due to GPS module communication

Open maehw opened this issue 1 year ago • 6 comments

Hi all!

As I've taken my notes in English (and everything is becoming quite technical), I am adding this as an issue on GitHub, not in the OBS forum:

We are currently trying to build some OBS HW v00.03.12. The flashed firmware version is OBS/v0.18.849.

As we've previously seen GPS modules failing, we tested the GPS modules in advance: we hooked the modules up to a Windows PC using a USB/serial converter cable (with a small test jig) and ran the u-blox u-center GNSS evaluation software. Everything worked fine: with their antennas attached, all the GPS modules got a fix quickly, with at least 4 satellites in sight.

Once a module is soldered onto the OBS PCB, the OBS won't boot up correctly and "freezes", showing "0 sats SN:0" on the display.

I've made two approaches to narrow the issue down:

  1. Use a Saleae Logic analyzer to sniff and analyze (using a UBX HLA) the traffic between the OBS and the GPS module; no ESP32 UART connection to the PC; the OBS is powered by operating the switch
  2. Have a look at the ESP32 UART logging data and also sniff the data; the OBS power switch doesn't play any role here

Approach 1

Initialization: agreement on a common baud rate

  • Around 55 ms after powerup, the GPS module sends a lot of NMEA data at 9600 baud (multiple $TDINF messages, $GNRMC, $GNGGA, 2x $GNGSA, 3x $GPGSV, $BDGSV, $GNTXT including the text ANTENNA OK)
  • After another approx. 540 ms, the GPS module sends some more NMEA messages
  • After this burst, another approx. 670 ms pass until the OBS starts to send some data itself: it queries CFG-RINV at 115'200 baud twice, each time without getting an answer within 200 ms
  • OBS then switches to 9600 baud and sends CFG-PRT for reconfiguration to 115'200 baud, directly switches itself back to 115'200 baud and re-sends CFG-RINV at the higher baud rate while the GPS still sends using the lower 9600 baud rate
  • GPS reconfigures to using 115'200 baud and sends an ACK-ACK for the CFG-PRT message at that higher rate
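
For reference, the CFG-RINV poll seen on the wire can be reproduced by hand. The following is a minimal sketch, not the firmware's actual code: the helper names buildUbxFrame and buildCfgRinvPoll are made up for illustration. It assumes the standard UBX framing (sync chars 0xB5 0x62, class, ID, little-endian length, payload, 8-bit Fletcher checksum) and that CFG-RINV is class 0x06, ID 0x34.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a UBX frame from class, ID and payload.
// Layout: 0xB5 0x62 | class | id | length (little endian) | payload | CK_A CK_B
std::vector<uint8_t> buildUbxFrame(uint8_t msgClass, uint8_t msgId,
                                   const std::vector<uint8_t>& payload) {
  std::vector<uint8_t> frame = {
      0xB5, 0x62, msgClass, msgId,
      static_cast<uint8_t>(payload.size() & 0xFF),
      static_cast<uint8_t>((payload.size() >> 8) & 0xFF)};
  frame.insert(frame.end(), payload.begin(), payload.end());

  // 8-bit Fletcher checksum over everything after the two sync chars.
  uint8_t ckA = 0;
  uint8_t ckB = 0;
  for (std::size_t i = 2; i < frame.size(); ++i) {
    ckA = static_cast<uint8_t>(ckA + frame[i]);
    ckB = static_cast<uint8_t>(ckB + ckA);
  }
  frame.push_back(ckA);
  frame.push_back(ckB);
  return frame;
}

// A poll request is simply a frame with an empty payload.
// Assumption: CFG-RINV is class 0x06, ID 0x34.
std::vector<uint8_t> buildCfgRinvPoll() {
  return buildUbxFrame(0x06, 0x34, {});
}
```

Under these assumptions buildCfgRinvPoll() yields the eight bytes B5 62 06 34 00 00 3A B4, which can be compared against the logic analyzer capture.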

Continued failed queries of the remote inventory

  • After around 200 ms the OBS resends the CFG-RINV query as it has not gotten a response in the meantime
  • After another approx. 200 ms the OBS again resends the CFG-RINV query as it has not gotten a response in the meantime
  • The GPS doesn't seem to care and sends some $GNRMC, $GNGGA, $GNGSA, $GPGSV, $BDGSV, $GNTXT NMEA messages
  • After another approx. 200 ms the OBS again resends the CFG-RINV query as it has not gotten a response in the meantime - for two more times!
  • OBS finally issues a CFG-RST request
  • GPS acknowledges the request by responding with an ACK-ACK for CFG-RST

Failed message configurations after reset and still no remote inventory

  • After some 85 milliseconds delay, again NMEA data from the GPS module
  • OBS sends a CFG-MSG request for message 0x0B 0x32, i.e. AID-ALPSRV
  • GPS replies with ACK-NACK
  • OBS requests AID-ALP, MON-VER, MON-HW, NAV-STATUS and CFG-NAV5
  • GPS replies with a 37 byte long MON-VER (! this does not seem to conform to the specification)
  • GPS replies with 36 byte long CFG-NAV5
  • After another approx. 200 ms the OBS sends another CFG-NAV5 request
  • GPS replies with CFG-NAV5
  • After a short delay, the GPS also again sends some $GNRMC, $GNGGA, $GNGSA, $GPGSV, $BDGSV, $GNTXT NMEA messages
  • OBS once again requests CFG-RINV
  • No reply from the GPS module within approx. 200 ms so OBS once again requests CFG-RINV
  • Still no reply from the GPS module within about 240 ms
  • OBS requests CFG-MSG for 0x01 0x20, i.e. NAV-TIMEGPS with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • OBS requests CFG-MSG for 0x01 0x03, i.e. NAV-STATUS with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • OBS requests CFG-MSG for 0x0A 0x09, i.e. MON-HW with a rate of 1 on the relevant port
  • ACK-NACK from the GPS module
  • Afterwards, cyclic NMEA message bursts from the GPS module continue - no UBX messages, no more data from the OBS
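
To tell the acknowledgements in such a capture apart, a small decoder can help. This is a minimal sketch under the assumption of the standard UBX ACK layout (class 0x05, ID 0x01 for ACK-ACK and 0x00 for the negative acknowledgement, followed by a two-byte payload naming the class and ID of the acknowledged message). The type and function names are hypothetical and not taken from the OBS firmware.

```cpp
#include <cstddef>
#include <cstdint>

enum class UbxAckResult { NotAnAck, Ack, Nak };

struct UbxAck {
  UbxAckResult result;
  uint8_t ackedClass;  // class of the message being (negatively) acknowledged
  uint8_t ackedId;     // ID of the message being (negatively) acknowledged
};

// Hypothetical helper. Expects exactly one complete frame:
// B5 62 05 <00|01> 02 00 <class> <id> CK_A CK_B
UbxAck parseUbxAck(const uint8_t* frame, std::size_t length) {
  const UbxAck notAnAck{UbxAckResult::NotAnAck, 0, 0};
  if (frame == nullptr || length != 10) return notAnAck;
  if (frame[0] != 0xB5 || frame[1] != 0x62) return notAnAck;
  if (frame[2] != 0x05 || frame[3] > 0x01) return notAnAck;
  if (frame[4] != 0x02 || frame[5] != 0x00) return notAnAck;

  // Verify the 8-bit Fletcher checksum over class, ID, length and payload.
  uint8_t ckA = 0;
  uint8_t ckB = 0;
  for (std::size_t i = 2; i < 8; ++i) {
    ckA = static_cast<uint8_t>(ckA + frame[i]);
    ckB = static_cast<uint8_t>(ckB + ckA);
  }
  if (ckA != frame[8] || ckB != frame[9]) return notAnAck;

  const UbxAckResult result =
      (frame[3] == 0x01) ? UbxAckResult::Ack : UbxAckResult::Nak;
  return UbxAck{result, frame[6], frame[7]};
}
```

Fed with the bytes B5 62 05 00 02 00 06 01 0E 33, this reports a negative acknowledgement for class 0x06, ID 0x01, i.e. for a CFG-MSG request like the ones listed above.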

Findings

  • The response of MON-VER doesn't look as if it was conforming to the specification (I'd expect a 40 byte long reply with 30 bytes software version and 10 bytes hardware version); the value starts with T3,RomFw,1.1(48), not sure if this is what one would expect from an original u-blox module!
  • Why no CFG-RINV reply at all? Neither an empty dump nor even an ACK-NACK (not sure if we expect any ACK-* though)?
  • Why the ACK-NACK to the CFG-MSG requests?
  • In my understanding either OBS does not request muting NMEA messages or it does not work.
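
The MON-VER length expectation can be written down as a quick plausibility check. This is only a sketch: it assumes the payload consists of a 30 byte software version and a 10 byte hardware version, optionally followed by any number of 30 byte extension strings, and it assumes the 37 bytes observed above count the payload only (the capture may have counted differently). The function name is made up for illustration.

```cpp
#include <cstddef>

// Hypothetical check. Assumed MON-VER payload layout:
// 30 bytes swVersion + 10 bytes hwVersion + N * 30 bytes extension (N >= 0).
bool isPlausibleMonVerPayloadLength(std::size_t payloadLength) {
  constexpr std::size_t kBaseLength = 30 + 10;
  constexpr std::size_t kExtensionLength = 30;
  return payloadLength >= kBaseLength &&
         (payloadLength - kBaseLength) % kExtensionLength == 0;
}
```

With this layout, 40 and 70 bytes pass, while 37 bytes is rejected.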

Approach 2

E (933) esp_core_dump_flash: No core dump partition found!
[    36][I][OpenBikeSensorFirmware.cpp:199] setup(): openbikesensor.org - OBS/v0.18.849
[    38][I][esp32-hal-i2c.c:75] i2cInit(): Initialising I2C Master: sda=21 scl=22 freq=100000
[    43][W][Wire.cpp:301] begin(): Bus already started in Master Mode.
[    89][I][VoltageMeter.cpp:40] VoltageMeter(): Initializing VoltageMeter.
[    90][I][VoltageMeter.cpp:54] VoltageMeter(): Characterized using eFuse Vref
[    92][I][VoltageMeter.cpp:62] VoltageMeter(): eFuse Two Point: NOT supported
[    99][I][VoltageMeter.cpp:66] VoltageMeter(): eFuse Vref: Supported
[   109][I][VoltageMeter.cpp:75] VoltageMeter(): VoltageMeter initialized got 0.21V.
[   155][I][OpenBikeSensorFirmware.cpp:617] loadConfig(): Load cfg
[  1058][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1259][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1259][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 401ms
[  1370][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNRMC: $GNRMC<snip-to-end-of-line>
[  1371][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGGA: $GNGGA<snip-to-end-of-line>
[  1380][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGSA: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.9,99.9,1*0A
[  1388][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNGSA: $GNGSA,A,1,,,,,,,,,,,,,99.9,99.9,99.9,4*0F
[  1398][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1409][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1420][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1431][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GPGSV: $GPGSV,<snip-to-end-of-line>
[  1440][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA BDGSV: $BDGSV,<snip-to-end-of-line>
[  1447][W][gps.cpp:1156] parseNmeaMessage(): Unparsed NMEA GNTXT: $GNTXT,1,1,01,ANTENNA OK*2B
[  1492][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1692][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  1692][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  1696][E][gps.cpp:305] setBaud(): Switch to 115200 was not possible, back to 9600.
[  1903][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  2103][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  2103][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  2107][E][gps.cpp:310] setBaud(): NO GPS????
[  7112][I][gps.cpp:450] addStatisticsMessage(): New: readGPSData(clear: 190 bytes in buffer, lastCall 5009ms ago, at 1970-01-01T00:00:07)
[  7315][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  7398][W][gps.cpp:663] encode(): Unexpected GPS char in state null: c7 Ç
...
[  7631][W][gps.cpp:663] encode(): Unexpected GPS char in state null: fe þ
[  7639][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 324ms.
[  7645][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 534ms
[  7853][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  8053][W][gps.cpp:356] sendAndWaitForAck(): Retry to send 0x3406 after 200ms.
[  8053][E][gps.cpp:361] sendAndWaitForAck(): Failed to send cfg. 0x3406 NAK: 0 after 400ms
[  8057][I][gps.cpp:180] softResetGps(): Soft-RESET GPS!
[  8232][W][gps.cpp:663] encode(): Unexpected GPS char in state null: c7 Ç
...
[  8247][W][gps.cpp:663] encode(): Unexpected GPS char in state null: 88 ˆ
...
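
A note on reading the log: the value 0x3406 in the sendAndWaitForAck() lines is consistent with CFG-RINV (class 0x06, ID 0x34) if the firmware logs the message identifier as a 16-bit value with the class in the low byte and the ID in the high byte. That byte order is an assumption derived from the capture in Approach 1, not confirmed from the firmware source. A minimal sketch with made-up names:

```cpp
#include <cstdint>

struct UbxMessageId {
  uint8_t msgClass;
  uint8_t msgId;
};

// Hypothetical helper. Assumption: low byte = UBX class, high byte = UBX ID.
UbxMessageId splitLoggedUbxId(uint16_t logged) {
  return UbxMessageId{static_cast<uint8_t>(logged & 0xFF),
                      static_cast<uint8_t>(logged >> 8)};
}
```

splitLoggedUbxId(0x3406) then gives class 0x06 and ID 0x34.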

I haven't put the same amount of effort into analysing this situation. But after start-up it looks like the GPS is sending at 115'200 baud while the OBS has fallen back to 9600 baud, so the cyclic NMEA messages may not even be interpretable as ASCII/NMEA messages. All UART input is garbage, and this makes everything fail?!
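
One way to tell intact NMEA sentences from baud rate garbage is the NMEA checksum: the two hex digits after the '*' are the XOR of all characters between '$' and '*'. The following is a minimal sketch with made-up function names, not taken from the OBS firmware:

```cpp
#include <cstddef>
#include <string>

// Returns the value of one hex digit, or -1 if the character is not a hex digit.
int hexDigitValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical helper: true if the sentence has the form "$<body>*HH"
// (optionally followed by CR/LF) and HH equals the XOR of all body characters.
bool hasValidNmeaChecksum(const std::string& sentence) {
  if (sentence.empty() || sentence[0] != '$') return false;
  const std::size_t star = sentence.rfind('*');
  if (star == std::string::npos || star + 2 >= sentence.size()) return false;

  unsigned checksum = 0;
  for (std::size_t i = 1; i < star; ++i) {
    checksum ^= static_cast<unsigned char>(sentence[i]);
  }

  const int high = hexDigitValue(sentence[star + 1]);
  const int low = hexDigitValue(sentence[star + 2]);
  if (high < 0 || low < 0) return false;
  return checksum == static_cast<unsigned>(high * 16 + low);
}
```

The sentence $GNTXT,1,1,01,ANTENNA OK*2B from the log above passes this check, whereas bytes such as c7 and fe received at the wrong baud rate fail it right at the missing '$'.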

Summary

I am not sure what's going on but I can provide further debugging details when I get some guidance.

Random thoughts:

  • Is it possible that u-blox u-center activated NMEA messages that the OBS firmware cannot cope with as they are unexpected?
  • Some corner case where baud rates are changed without being required?
  • It looks to me as if the OBS waits for neither the reset request nor the baud rate reconfiguration to be acknowledged by the GPS module before continuing communication.
  • Counterfeit GPS module that is not specification-conformant - works with the u-blox u-center but not the OBS firmware?
  • Is missing almanac data problematic? I didn't provide any so far. If that's the case, I suggest the firmware handle this situation differently if it can be recognized.
  • Is the remote inventory or the access to it broken and causing all those issues?

I haven't dived too much into the firmware.

Your help is highly appreciated.


Edit:

I played around with the u-blox u-center and found out about the NMEA messages:

  • GxGGA seems to stand for Global Positioning System Fix Data
  • GxGSA GNSS DOP and Active Satellites
  • GxGSV GNSS Satellites in View
  • GxRMC Recommended Minimum Specific GNSS Data
  • GxTXT Text Transmission
  • BDGSV still seems like random unknown magic

Maybe they cannot be deactivated and are even required by the OBS? Or did u-blox u-center just activate them because of its opened views and store those settings on the module? E.g. GxGGA "SVs Used" or GxGSA "SVs Used": "Number of SVs used for Navigation"

Could OBS display the min./max. C/N_0 of the 4 satellites with the best C/N_0 instead of some magic (absolute?) noisePerMs noise level from MON-HW? Maybe the latter correlates with N_0 but doesn't give any info about signal reception quality?
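
To illustrate the suggestion, here is a minimal sketch of how the min./max. C/N_0 of the four strongest satellites could be computed. It assumes the per-satellite C/N_0 values (in dB-Hz) have already been collected from some per-satellite message; which message the firmware would use for that is left open, and the names CnoRange and bestCnoRange are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

struct CnoRange {
  uint8_t weakest;    // lowest C/N_0 among the selected satellites (dB-Hz)
  uint8_t strongest;  // highest C/N_0 among the selected satellites (dB-Hz)
  std::size_t count;  // number of satellites actually considered
};

// Hypothetical helper: picks the `best` strongest satellites (fewer if not
// enough are available) and reports their weakest and strongest C/N_0.
CnoRange bestCnoRange(std::vector<uint8_t> cnoDbHz, std::size_t best = 4) {
  const std::size_t n = std::min(best, cnoDbHz.size());
  if (n == 0) return CnoRange{0, 0, 0};

  // Sort only the first n positions in descending order.
  std::partial_sort(cnoDbHz.begin(),
                    cnoDbHz.begin() + static_cast<std::ptrdiff_t>(n),
                    cnoDbHz.end(), std::greater<uint8_t>());
  return CnoRange{cnoDbHz[n - 1], cnoDbHz[0], n};
}
```

For the values 12, 40, 33, 25, 45 and 18 dB-Hz this selects 45, 40, 33 and 25, so the display could show a range of 25 to 45.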


Edit2:

Okay, this does not look NMEA-related:

  • Gps::parseUbxMessage() calls mIncomingGpsRecord.setInfo(mGpsBuffer.navSol.numSv, ...) when it receives UBX NAV-SOL messages.
  • GpsRecord::setInfo(uint8_t satellitesInUse, GPS_FIX gpsFix, uint8_t flags) stores the argument in member mSatellitesUsed
  • Gps::showWaitStatus() prints String(mCurrentGpsRecord.mSatellitesUsed) + "sats SN:" + String(mLastNoiseLevel); on the display

From a static code analysis (not runtime debugging) view, we're stuck in the following endless loop:

while (!gps.hasFix(obsDisplay)) {
  currentTimeMillis = millis();
  gps.handle();
  // ...
  gps.showWaitStatus(obsDisplay);
  if (button.read() == HIGH) {
    log_d("Skipped get GPS...");
    obsDisplay->showTextOnGrid(2, obsDisplay->currentLine(), "...skipped");
    break;
  }
}

  • gps.hasFix() uses GpsRecord::hasValidFix() (https://github.com/openbikesensor/OpenBikeSensorFirmware/blob/53e9ea4440d13a7292031f45227c5144bc04e8e1/src/gpsrecord.cpp#L118..L121) which includes && mSatellitesUsed != 0 as condition
  • I think we won't be able to break out of this loop without receiving a NAV-SOL message, which has never been requested (according to my log above)

maehw avatar May 05 '23 20:05 maehw