Add periodic reboot/reboot on wifi disconnection option.
Is your feature request related to a problem? Please describe. Device goes offline randomy after at least few hours. Disconnects from MQTT and can't load web interface too. Solution is to power cycle the device.
Describe the solution you'd like Option to have device restarted at periodic intervals. And/or a device to reboot after it has disconnected from the network.
Describe alternatives you've considered N/A
Additional context N/A
I also have problems with my ESP32 disconnecting and not coming online again. The reboot would be a welcome workaround but an actual solution would also be appreciated.
Hey,
Do either of you have any serial logs to share when the device becomes unresponsive? Preferably with debug logs enabled. It should give some insight as to what's going on.
Thanks!
Thanks for noticing this. I have only remote logging set up. The BLE device I am monitoring is very chatty. I use the default bin and see lots of Bluetooth message but nothing else.
@shmuelzon perhaps you could provide a bin where less/no bluetooth messages are being logged. I would try to catch the ESP32 module's log output on a Raspberry Pi or an ESP8266. From what my research shows the ESP32 not reconnecting seems to be a more common issue.
Hey,
@giddyhup Please find attached binaries with some log messages removed. One's with the default INFO level debugs while the other is DEBUG. I've removed some of my debug messages but you'll probably still see some from the other modules.
ble2mqtt.debug.bin.gz ble2mqtt.info.bin.gz
I'm interested in what you've found regarding the ESP32 not reconnecting. Anything concrete for the root cause or any workarounds?
@shmuelzon thanks for the binaries. It may take some time for me to work with them but I will report back.
Regarding re-connection issues. I noticed this comment in wifi.c:
This is a workaround as ESP32 WiFi libs don't currently auto-reassociate.
I did first what could be done without much infrastructure change. I remote-flashed the DEBUG image and collected the log remotely. The flash procedure ended with a warm reboot. I used remote logging and after about a minute encountered a disconnect. In the past I experienced disconnects only about once in 12 hours so I was either 'lucky' or something was triggered (wild idea: could it be DHCP renewal?).
The log showed no abnormalities. The last entry is the discovery of a BLE device that the module did not connect to (I only have one address whitelisted). After a forced manual reset via the module's reset button the messages appearing in the remote log still show no abnormalities. There is just some MQTT connection and subscribe chatter and listing of BLE devices.
As I feared, I really have to find a solution for direct logging from a Raspberry Pi to get what is going on when the module disconnects.
Unfortunately, you're right. Debugging WiFi issues can't be done over WiFi. This is a reoccurring problem and some ESP-IDF versions seem more stable than others. To the best of my knowledge I'm not doing anything special with the way I handle WiFi but I know that using it in conjunction with Bluetooth causes issues. I have other projects that use the exact same WiFi code (but don't include Bluetooth) that remain connected for weeks without issues... I've considered trying to switch from Bluedroid to nimBLE but I don't know if it will make any difference since both use the same PHY.
Please find attached a redacted excerpt where I de-authed the module on purpose to see what happens.
A panic happened, the module rebooted and reconnected. So here, the result at least was desirable. I had remote logging still enabled but disabled it now for the next tests. esp32ble.log
Thanks for that. It looks like we've run or of memory (?) But, it might be related to the fact that debug logs are enabled, it takes its toll...
Apparently serial logging fixes the reconnection issues. : )
The module is online since I started logging. I'll keep logging.
Apparently serial logging fixes the reconnection issues. : )
The module is online since I started logging. I'll keep logging.
How do I enable this accidental fix?
I was joking, but if you want to try it yourself here's what I did:
I flashed (OTA) ble2mqtt.debug.bin.gz and connected my ESP32 module via its USB port to a Raspberry Pi. On the Pi I use minicom to collect the module's serial output and write it to a file.
BTW - Not sure if relevant to your setup, I once had an ESP32 connected to an RPI2 (I think) and it would often fail and get stuck in a boot loop but when I moved it to another machine (with USB3) it didn't happen again. I'm guessing it might be related to the amount of current the USB was able to provide.
@shmuelzon, thanks for pointing that out.
Initially, my ESP32 was connected to a USB charging brick that provides 5V/1A or more. Currently, it is connected to a Pi 4's USB 2 port. So I guess it is currently getting less juice while before (where it had issues reconnecting). Perhaps later will connect it to the charger again and see if failing reconnects reoccur.
@shmuelzon I was offline for a few days and am now back to continue with this issue.
I downgraded to the official version (16) and can confirm that the issue returns. In the log these entries keep repeating:
I (87334265) WiFi: Disconnected
I (87334265) BLE2MQTT: Disconnected from the network, stopping MQTT
I (87334265) MQTT: Disconnecting MQTT client
W (87334265) wifi:m f probe req l=0
W (87334395) wifi:m f probe req l=0
W (87334515) wifi:m f probe req l=0
W (87334635) wifi:m f probe req l=0
W (87334755) wifi:m f probe req l=0
W (87334875) wifi:m f probe req l=0
W (87334995) wifi:m f probe req l=0
W (87335125) wifi:m f probe req l=0
W (87335245) wifi:m f probe req l=0
W (87335365) wifi:m f probe req l=0
W (87335485) wifi:m f probe req l=0
I (87335845) WiFi: Disconnected
I (87335845) BLE2MQTT: Disconnected from the network, stopping MQTT
I (87335845) MQTT: Disconnecting MQTT client
W (87335845) wifi:m f probe req l=0
I am using the debug version again and a few minutes ago the BT connection to the (single whitelisted) device I am listening to got disconnected. Initially, when I checked the web interface the request appeared in the log but during handling the web request the module disconnected:
D (12974347) httpd: httpd_server: processing listen socket 54
D (12974347) httpd: httpd_accept_conn: newfd = 57
D (12974347) httpd_sess: httpd_sess_new: fd = 57
D (12974347) httpd_sess: httpd_sess_new: active sockets: 1
D (12974357) httpd: httpd_accept_conn: complete
D (12974357) httpd: httpd_server: doing select maxfd+1 = 58
D (12974367) httpd: httpd_process_session: processing socket 57
D (12974367) httpd_sess: httpd_sess_process: httpd_req_new
D (12974377) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974387) httpd_txrx: httpd_recv_with_opt: received length = 128
D (12974387) httpd_parse: read_block: received HTTP request block size = 128
D (12974397) httpd_parse: cb_url: message begin
D (12974397) httpd_parse: cb_url: processing url = /
D (12974407) httpd_parse: verify_url: received URI = /
D (12974407) httpd_parse: cb_header_field: headers begin
D (12974417) httpd_txrx: httpd_unrecv: length = 112
D (12974417) httpd_parse: pause_parsing: paused
D (12974427) httpd_parse: cb_header_field: processing field = Host
D (12974437) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974437) httpd_txrx: httpd_recv_with_opt: pending length = 112
D (12974447) httpd_parse: read_block: received HTTP request block size = 112
D (12974457) httpd_parse: continue_parsing: skip pre-parsed data of size = 5
D (12974457) httpd_parse: continue_parsing: un-paused
D (12974467) httpd_parse: cb_header_value: processing value = ble2mqtt-d4ac.lan
D (12974477) httpd_parse: cb_header_field: processing field = User-Agent
D (12974477) httpd_parse: cb_header_value: processing value = Mozilla/5.0 (Wind9
D (12974487) httpd_parse: parse_block: parsed block size = 112
D (12974497) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974507) httpd_txrx: httpd_recv_with_opt: received length = 128
D (12974507) httpd_parse: read_block: received HTTP request block size = 128
D (12974517) httpd_parse: cb_header_value: processing value = 9.0
D (12974527) httpd_parse: cb_header_field: processing field = Accept
D (12974527) httpd_parse: cb_header_value: processing value = text/html,applica8
D (12974547) httpd_parse: cb_header_field: processing field = Accept-Language
D (12974547) httpd_parse: cb_header_value: processing value = en
D (12974557) httpd_parse: cb_header_field: processing field = Accept-
D (12974567) httpd_parse: parse_block: parsed block size = 240
D (12974567) httpd_txrx: httpd_recv_with_opt: requested length = 128
D (12974577) httpd_txrx: httpd_recv_with_opt: received length = 101
D (12974587) httpd_parse: read_block: received HTTP request block size = 101
D (12974587) httpd_parse: cb_header_field: processing field = Encoding
D (12974597) httpd_parse: cb_header_value: processing value = gzip, deflate
D (12974607) httpd_parse: cb_header_field: processing field = DNT
D (12974607) httpd_parse: cb_header_value: processing value = 1
D (12974617) httpd_parse: cb_header_field: processing field = Connection
D (12974617) httpd_parse: cb_header_value: processing value = keep-alive
D (12974627) httpd_parse: cb_header_field: processing field = Upgrade-Insecure-s
D (12974637) httpd_parse: cb_header_value: processing value = 1
D (12974637) httpd_parse: cb_header_field: processing field = Sec-GPC
D (12974647) httpd_parse: cb_header_value: processing value = 1
D (12974657) httpd_parse: cb_headers_complete: bytes read = 359
D (12974657) httpd_parse: cb_headers_complete: content length = 0
D (12974667) httpd_parse: pause_parsing: paused
D (12974667) httpd_parse: cb_no_body: message complete
D (12974677) httpd_parse: httpd_parse_req: parsing complete
D (12974687) httpd_uri: httpd_uri: request for / with type 1
D (12974687) httpd_uri: httpd_find_uri_handler: [0] = /restart
D (12974697) httpd_uri: httpd_find_uri_handler: [1] = /status
D (12974697) httpd_uri: httpd_find_uri_handler: [2] = /ble/bonding_db
D (12974707) httpd_uri: httpd_find_uri_handler: [3] = /ble/devices
D (12974717) httpd_uri: httpd_find_uri_handler: [4] = /ota/firmware
D (12974717) httpd_uri: httpd_find_uri_handler: [5] = /ota/configuration
D (12974727) httpd_uri: httpd_find_uri_handler: [6] = /fs/*
D (12974727) httpd_uri: httpd_find_uri_handler: [7] = /fs/*
D (12974737) httpd_uri: httpd_find_uri_handler: [8] = /fs/*
D (12974747) httpd_uri: httpd_find_uri_handler: [9] = /
D (12974747) HTTPD: Handling route for /
D (12974747) httpd_txrx: httpd_resp_set_hdr: new header = Content-Encoding: gzip
D (12974757) httpd_txrx: httpd_send_all: sent = 64
D (12974767) httpd_txrx: httpd_send_all: sent = 16
D (12974767) httpd_txrx: httpd_send_all: sent = 2
D (12974777) httpd_txrx: httpd_send_all: sent = 4
D (12974777) httpd_txrx: httpd_send_all: sent = 2
D (12974787) httpd_txrx: httpd_send_all: sent = 2
D (12974787) httpd_txrx: httpd_send_all: sent = 1055
D (12974797) httpd_sess: httpd_sess_process: httpd_req_delete
D (12974797) httpd_sess: httpd_sess_process: success
D (12974807) httpd: httpd_server: doing select maxfd+1 = 58
I (12980727) wifi:bcn_timout,ap_probe_send_start
W (12981737) wifi:m f null
I (12983237) wifi:ap_probe_send over, resett wifi status to disassoc
I (12983237) wifi:state: run -> init (c800)
D (12983237) wifi:connect status 5 -> 3
D (12983237) wifi:send diassoc for beacon timeout(200)
D (12983247) wifi:sta leave
I (12983247) wifi:pm stop, total sleep time: 197398855 us / 233968858 us
D (12983247) wifi:stop CSA timer
W (12983257) wifi:<ba-del>idx
W (12983257) wifi:<ba-del>idx
D (12983257) wifi:remove xx:xx:xx:xx:cd:2e from rc list
I (12983267) wifi:new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1
D (12983267) wifi:filter: set rx policy=8
D (12983277) wifi:Send disconnect event, reason=200, AP number=0
D (12983277) wifi:connect status 3 -> 0