[Bug]: NRF52840 - Random crash on packet processing
Category
Hardware Compatibility
Hardware
Other
Is this bug report about any UI component firmware like InkHUD or Meshtastic UI (MUI)?
- [ ] Meshtastic UI aka MUI colorTFT
- [ ] InkHUD ePaper
- [ ] OLED slide UI on any display
Firmware Version
2.7.14.50f9be9a2
Description
Hello, team. Caught the crash.
DEBUG | ??:??:?? 13536 [RadioIf] Raw incoming packet: 68 84 5d 33 28 63 65 ba d0 f9 86 dd ef 08 00 28 h.]3(ce........(
DEBUG | ??:??:?? 13536 [RadioIf] Raw incoming packet: 4e 67 6a 3f 91 c2 50 bb 21 a6 ed 4d 5e b7 50 72 Ngj?..P.!..M^.Pr
DEBUG | ??:??:?? 13536 [RadioIf] Raw incoming packet: c1 5e 2c 56 ab a5 d2 e3 e4 ed 26 89 1e 55 63 4f .^,V......&..UcO
DEBUG | ??:??:?? 13536 [RadioIf] Raw incoming packet: f7 01 82 3c 31 22 ...<1"
DEBUG | ??:??:?? 13536 [RadioIf] Corrected frequency offset: -4.843750
DEBUG | ??:??:?? 13536 [RadioIf] Lora RX
0xdd86f9d0:0xba656328->0x335d8468 tr: 0 WantAck=1 HL=7 HS=7 Ch=0x8 ENC len=54 SNR:6.5 RSSI:-31 Rly:0x28
DEBUG | ??:??:?? 13536 [RadioIf] Packet RX: 641ms
INFO | ??:??:?? 13536 [Router] Packet History - insert: Using new slot @uptime 13536.484s TRACE NEW
DEBUG | ??:??:?? 13536 [Router] Use channel 0 (hash 0x8)
DEBUG | ??:??:?? 13536 [Router] Expand short PSK #1
DEBUG��@INFO | ??:??:?? 2
//\ E S H T /\ S T / C
DEBUG | ??:??:?? 2 Filesystem files:
If I replay the same packet, it does not crash:
DEBUG | ??:??:?? 1106 [RadioIf] Raw incoming packet: 68 84 5d 33 28 63 65 ba d0 f9 86 dd ef 08 00 28 h.]3(ce........(
DEBUG | ??:??:?? 1106 [RadioIf] Raw incoming packet: 4e 67 6a 3f 91 c2 50 bb 21 a6 ed 4d 5e b7 50 72 Ngj?..P.!..M^.Pr
DEBUG | ??:??:?? 1106 [RadioIf] Raw incoming packet: c1 5e 2c 56 ab a5 d2 e3 e4 ed 26 89 1e 55 63 4f .^,V......&..UcO
DEBUG | ??:??:?? 1106 [RadioIf] Raw incoming packet: f7 01 82 3c 31 22 ...<1"
DEBUG | ??:??:?? 1106 [RadioIf] Corrected frequency offset: -148.218750
DEBUG | ??:??:?? 1106 [RadioIf] Lora RX
0xdd86f9d0:0xba656328->0x335d8468 tr: 0 WantAck=1 HL=7 HS=7 Ch=0x8 ENC len=54 SNR:6.25 RSSI:-21 Rly:0x28
DEBUG | ??:??:?? 1106 [RadioIf] Packet RX: 641ms
INFO | ??:??:?? 1106 [Router] Packet History - insert: Using new slot @uptime 1106.044s TRACE NEW
DEBUG | ??:??:?? 1106 [Router] Use channel 0 (hash 0x8)
DEBUG | ??:??:?? 1106 [Router] Expand short PSK #1
DEBUG | ??:??:?? 1106 [Router] Use AES128 key!
DEBUG | ??:??:?? 1106 [Router] decoded message
0xdd86f9d0:0xba656328->0x335d8468 tr: 1 WantAck=1 HL=7 HS=7 Ch=0x0 Portnum:4 SNR:6.25 RSSI:-21 Rly:0x28
Any ideas?
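For anyone comparing the raw dump with the decoded line: the first 16 bytes of the dump appear to map directly onto the fields printed after "Lora RX". Below is a minimal sketch of that mapping. The struct and function names (`RawHeader`, `parseRawHeader`, `readLe32`) are made up for illustration, and the flag bit positions are my reading of the firmware's packet header, so they should be checked against the firmware source before relying on them.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only: layout inferred by comparing the raw dump with the
// decoded "Lora RX" line in this report. Not the firmware's own struct.
struct RawHeader {
    uint32_t to;         // bytes 0-3, little-endian
    uint32_t from;       // bytes 4-7, little-endian
    uint32_t id;         // bytes 8-11, little-endian
    uint8_t hopLimit;    // flags bits 0-2 (assumed), logged as HL
    bool wantAck;        // flags bit 3 (assumed), logged as WantAck
    uint8_t hopStart;    // flags bits 5-7 (assumed), logged as HS
    uint8_t channelHash; // byte 13, logged as Ch
    uint8_t relayNode;   // byte 15, logged as Rly
};

static uint32_t readLe32(const uint8_t *p)
{
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns false if the buffer cannot hold the 16-byte header.
bool parseRawHeader(const uint8_t *buf, size_t len, RawHeader &out)
{
    if (buf == nullptr || len < 16)
        return false;
    out.to = readLe32(buf);
    out.from = readLe32(buf + 4);
    out.id = readLe32(buf + 8);
    const uint8_t flags = buf[12];
    out.hopLimit = flags & 0x07;
    out.wantAck = (flags & 0x08) != 0;
    out.hopStart = (flags >> 5) & 0x07;
    out.channelHash = buf[13];
    out.relayNode = buf[15];
    return true;
}
```

Feeding it the first 16 bytes of the crashing packet (`68 84 5d 33 28 63 65 ba d0 f9 86 dd ef 08 00 28`) gives to=0x335d8468, from=0xba656328, id=0xdd86f9d0, HL=7, HS=7, WantAck=1, Ch=0x8, Rly=0x28, which matches the logged line. So the header itself looks well-formed in both the crashing and the replayed case.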
Relevant log output
Caught one more
DEBUG | ??:??:?? 369 [RadioIf] Raw incoming packet: 55 f6 57 21 28 63 65 ba 1a 3a 24 48 ef 08 00 28 U.W!(ce..:$H...(
DEBUG | ??:??:?? 369 [RadioIf] Raw incoming packet: b6 17 ae 87 84 ba f7 ea ab 11 35 cf 96 08 6e 04 ..........5...n.
DEBUG | ??:??:?? 369 [RadioIf] Raw incoming packet: 83 4a a0 9c a9 15 24 99 50 a4 7b 95 bd 97 43 40 .J....$.P.{...C@
DEBUG | ??:??:?? 369 [RadioIf] Raw incoming packet: f6 55 15 7f .U..
DEBUG | ??:??:?? 369 [RadioIf] Corrected frequency offset: -31.968748
DEBUG | ??:??:?? 369 [RadioIf] Lora RX
0x48243a1a:0xba656328->0x2157f655 tr: 0 WantAck=1 HL=7 HS=7 Ch=0x8 ENC len=52 SNR:5.75 RSSI:-33 Rly:0x28
DEBUG | ??:??:?? 369 [RadioIf] Packet RX: 641ms
��@INFO | ??:??:?? 2
//\ E S H T /\ S T / C
The next log line should have been one of these:
LOG_DEBUG("Packet RX: %ums", airtime_ms);
# or
LOG_DEBUG("Packet RX (noise?) : %ums", airtime_ms);
Does the crash also still occur in firmware 2.7.15? That is the current beta version, i.e. out of alpha stage.
Flashed with 2.7.15.567b8ea. Will see.
https://github.com/meshtastic/firmware/blob/8fe98db5dd6738546db0d27c6823e3380df322d4/src/mesh/Router.cpp#L430-L432
This uses nodeDB->getMeshNode(p->to)->user.public_key.size without first ensuring getMeshNode(p->to) is non-NULL; if the "to" node isn't present in the DB we dereference nullptr and crash.
@compumike Do you agree? I'm confused because if that's the case I wonder why it wouldn't happen all the time
Edit: Oh wait, the "to" node is us 😄 Could it happen that the NodeDB does not contain the local node (yet)?
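To make the suspected pattern concrete, here is a minimal sketch of a guarded lookup. Only `getMeshNode` and `user.public_key.size` come from the linked code. `MockNodeDB`, `NodeInfo`, `User`, `PublicKey`, and `hasPublicKey` are hypothetical stand-ins so the example compiles on its own, and the `size > 0` condition is an assumption, not necessarily what Router.cpp checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for the firmware types, just enough to show the guard.
struct PublicKey {
    size_t size = 0;
};
struct User {
    PublicKey public_key;
};
struct NodeInfo {
    uint32_t num = 0;
    User user;
};

class MockNodeDB
{
  public:
    void add(uint32_t num, size_t keySize)
    {
        NodeInfo info;
        info.num = num;
        info.user.public_key.size = keySize;
        nodes[num] = info;
    }

    // Returns nullptr when the node is not in the DB, which is the case
    // the unguarded dereference would not survive.
    NodeInfo *getMeshNode(uint32_t num)
    {
        auto it = nodes.find(num);
        return it == nodes.end() ? nullptr : &it->second;
    }

  private:
    std::unordered_map<uint32_t, NodeInfo> nodes;
};

// Check the pointer before touching user.public_key.
bool hasPublicKey(MockNodeDB &db, uint32_t nodeNum)
{
    const NodeInfo *node = db.getMeshNode(nodeNum);
    return node != nullptr && node->user.public_key.size > 0;
}
```

With a guard like this, an unknown "to" node simply reports that no key is available instead of dereferencing nullptr. Whether the local node can actually be missing from the NodeDB at that point is still an open question.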
Hi @FFAMax how is it coming along with firmware 2.7.15? Does the error on decode still occur?
Testing interrupted. Restarted. Monitoring in progress.
@shalberd uptime 48h
Uptime 98h.
Uptime 6 days. Test finished.
The original issue seems to be related to the LOG_DEBUG calls that dump incoming packets to the terminal. I'm not 100% sure, but with those lines commented out the device is more stable.