[Bug]: MQTT enabled with custom server/auth causes a reboot loop

Open acidvegas opened this issue 10 months ago • 8 comments

Category

WiFi

Hardware

T-Deck

Firmware Version

2.3.6.7

Description

Enabling WiFi and MQTT with a custom MQTT server that uses auth and TLS causes the device to get stuck in a reboot loop.

Relevant log output

No response

acidvegas avatar Apr 22 '24 23:04 acidvegas

@acidvegas Can you observe the same things as documented in issue #3496?

oherrala avatar Apr 23 '24 10:04 oherrala

@oherrala No, I can't observe anything. As soon as I set a custom MQTT server with TLS and auth, it just reboots in a loop, over and over.

acidvegas avatar Apr 26 '24 17:04 acidvegas

@acidvegas I don't know if there's documentation about debugging this stuff. I've been trying to figure these out on my own. A quick guide:

Set device.debug_log_enabled to true (e.g. meshtastic --set device.debug_log_enabled true) or from the App.

After that you can open a serial connection to the device via USB (e.g. screen /dev/tty.usb<something>) or try using the serial monitor on https://flasher.meshtastic.org/.

It should give you a log of what the device is doing before it reboots.
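
If the log scrolls away too fast in screen, the capture step above could be scripted so that everything the device prints lands in a file. This is only a sketch under assumptions: it relies on the third-party pyserial package, and the 115200 baud rate, the file name meshtastic.log, and both function names are illustrative choices that do not come from this thread.

```python
import time


def capture_log(stream, out, clock=time.strftime):
    """Copy lines from a byte stream to `out`, prefixing a host-side timestamp.

    Returns the last non-empty line seen, i.e. the last thing the device
    printed before the stream ended.
    """
    last = ""
    for raw in iter(stream.readline, b""):
        line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        out.write(f"{clock('%H:%M:%S')} {line}\n")
        out.flush()  # keep the file current in case the capture is interrupted
        last = line
    return last


def capture_from_serial(port, path="meshtastic.log", baud=115200):
    """Append everything read from serial `port` to `path` (stop with Ctrl-C).

    Assumption: the device logs at 115200 baud; adjust if yours differs.
    """
    import serial  # third-party: pip install pyserial

    with serial.Serial(port, baud, timeout=None) as dev, open(path, "a") as log:
        return capture_log(dev, log)
```

Calling capture_from_serial with the same /dev/tty.usb<something> path you would pass to screen should leave the last lines before each reboot at the end of the file.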

oherrala avatar Apr 26 '24 17:04 oherrala

The logs from the serial monitor stop at: INFO | 01:11:47 25 [WebServerThread] Attempting to connect directly to MQTT server mqtt.redacted.com, port: 1883, username: acidvegas, password: redacted

So I really can't debug past this point :(

Here is the relevant line of code for that log line, so it must be around here somewhere... https://github.com/meshtastic/firmware/blob/70712d859cdc242b1873fa9b9622cf1c1711fb06/src/mqtt/MQTT.cpp#L330-L354

My Mosquitto MQTT server is throwing:

714932188: Bad socket read/write on client <unknown>: Invalid arguments provided.
1714932225: New connection from REDACTED_IP:59348 on port 1883.

This is from testing without TLS (because I didn't know whether this was a TLS issue or not).
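
To separate broker-side problems from firmware problems, the same plain-versus-TLS comparison can be run from a PC. The sketch below uses only the Python standard library; the function name check_broker is made up for illustration. It checks the transport only (TCP connect, and optionally a TLS handshake with certificate verification). It does not speak MQTT, so it cannot detect username/password or ACL problems.

```python
import socket
import ssl


def check_broker(host, port, use_tls=False, timeout=5.0):
    """Try a TCP connection (and optionally a TLS handshake) to host:port.

    Returns (ok, detail). Transport check only; no MQTT packets are sent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if not use_tls:
                return True, "tcp connect ok"
            # The default context verifies the certificate chain and hostname.
            ctx = ssl.create_default_context()
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"tls handshake ok ({tls.version()})"
    except (OSError, ssl.SSLError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
```

For example, check_broker("mqtt.example.org", 8883, use_tls=True) with your own hostname in place of the placeholder. If the handshake succeeds from a PC but the device still reboots, that points toward the firmware rather than the broker or its certificate.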

acidvegas avatar May 05 '24 01:05 acidvegas

I should also mention the T3-S3s and Heltecs are having this same issue. So it's not even a T-Deck-specific issue.

Is anyone using a custom MQTT server with TLS & auth successfully?

ALSO, you can fix your device by running meshtastic --set mqtt.enabled false RIGHT after it boots. This will stop the reboot loop so you don't have to reflash the device to make it stop.

acidvegas avatar May 05 '24 01:05 acidvegas

I finally got a connection to my MQTT server (it was an ACL issue in Mosquitto), but TLS still causes a reboot loop.

acidvegas avatar May 05 '24 19:05 acidvegas

Same here... boot loop with TLS. Then it started working. Unfortunately, I didn't have log capture running in minicom when it started working. Will see if I can reproduce and get logs.

setup:

  • Heltec paper v1.1
  • firmware 2.4.2 beta
  • TLS on port 8883 with Let's Encrypt certificates

mkgin avatar Aug 30 '24 11:08 mkgin

Timing the meshtastic --set mqtt.enabled false workaround is challenging. I was also unable to get the device into flashing mode, so I ended up blocking the device from the WiFi network to get the loop to stop.
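
One way to take the timing out of the workaround might be to retry the command in a loop until it goes through. This is an untested sketch: the command is the one quoted in this thread, while the helper names, the retry count, and the delays are arbitrary. It also assumes the meshtastic CLI exits with status 0 only when the setting was actually written, which has not been verified for every failure mode.

```python
import subprocess
import time

# Command quoted in this thread; everything else here is illustrative.
CMD = ["meshtastic", "--set", "mqtt.enabled", "false"]


def run_cli():
    """Run the CLI once; True if it exited with status 0 (assumed to mean success)."""
    try:
        return subprocess.run(CMD, timeout=20).returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False


def retry_until_ok(attempt=run_cli, max_tries=60, delay=0.5, sleep=time.sleep):
    """Call `attempt` until it returns True.

    Returns the number of tries used, or 0 if it never succeeded.
    """
    for n in range(1, max_tries + 1):
        if attempt():
            return n
        sleep(delay)
    return 0
```

Start retry_until_ok() on the PC, then plug in or reset the device, so that one of the attempts has a chance to land in the window between boot and the MQTT connection attempt.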

mkgin avatar Aug 30 '24 12:08 mkgin