diyBMSv4ESP32
Lost WIFI connection that is not restored until the controller is restarted (mesh network?)
Describe the bug
Quite often lately, up to three times a day, the Wi-Fi connection between the controller and the Wi-Fi network is lost, and the controller cannot be accessed. This does not affect its operation, since it communicates with a CERBO via Modbus, but the controller stops sending monitoring data via MQTT. It surprises me that the controller itself, once the Wi-Fi connection is lost, does not try to re-establish it.
Hardware/Software Versions
Controller version (from PCB): 4.2
(below can be obtained from the "About" page in the controller web interface)
Processor: ESP32
Version: f6e94fbd18cf7803e818459735021730541ce20b
Compiled: 2023-09-08T14:51:55.713Z
To Reproduce
I cannot reproduce the error on demand; it appears occasionally and I have not been able to establish a cause for it. It could be something in the function that automatically re-establishes the Wi-Fi connection after it has been lost for any reason.
Same problem here after restarting my router last night.
With the preview firmware version from June 14, the problem never occurred. I think the problem only occurs with the latest firmware version.
The controller code does attempt to reconnect to the WIFI, but something has obviously changed or is not working. The IDF firmware in the latest release has been upgraded to 4.4.5, which could be the underlying issue.
To resolve this I need:
- The signal level from the TFT display (dBM) value
- The USB serial output from the ESP32 when this problem occurs
I'm sorry I'm not very useful. First of all, I wouldn't know how to measure the dBm value. And I don't have the controller in a place where I can easily get the USB serial output.
The dBm value is shown on the TFT screen (bottom right, before the time). Mine reports -74 dBm. It's the WIFI signal strength.
I've just pushed another release up which changes to ESP Arduino framework 2.0.12 - version 2.0.11 had a bug in it which caused lots of memory to be used. This could be a reason why WIFI became unstable.
New release is here. Please try this one and see if the WIFI problem is resolved.
https://github.com/stuartpittaway/diyBMSv4ESP32/releases/tag/Tag-2023-09-12-10-49
Updated. I will report back in a couple of days on whether the problem has stopped occurring. Regarding the strength of the WIFI signal, it has a value of -88 dBm.
20 hours and the Wi-Fi connection has not been lost, or if it was lost briefly, it was re-established without my noticing. Stuart, what you did worked. Thank you.
Thanks for the update @chapulino, I'll leave the ticket open for now. Just a note that your WIFI signal strength of -88dBM is quite low. You might want to consider using the ESP32 with an external antenna.
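To put the two readings reported so far (-74 dBm and -88 dBm) in perspective: dBm is a logarithmic scale, so a 14 dB gap corresponds to roughly a 25-fold difference in received power. A minimal sketch of the conversion follows; the function names are illustrative only and are not part of the diyBMS code.

```cpp
#include <cassert>
#include <cmath>

// Convert a power level in dBm to milliwatts: P(mW) = 10^(dBm / 10).
double dbmToMilliwatts(double dbm) {
    assert(std::isfinite(dbm));
    return std::pow(10.0, dbm / 10.0);
}

// How many times more power signal `aDbm` carries than signal `bDbm`.
double powerRatio(double aDbm, double bDbm) {
    assert(std::isfinite(aDbm) && std::isfinite(bDbm));
    return std::pow(10.0, (aDbm - bDbm) / 10.0);
}
```

For example, `powerRatio(-74.0, -88.0)` evaluates to about 25.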
@chapulino What happens if you turn off the WiFi and turn it back on again after a few minutes? Is it reconnecting?
@chapulino You wrote: This does not influence its operation since it is communicated by Modbus with a CERBO, but the controller stops sending monitoring data via MQTT
Is that right or do you mean diyBMS CAN-bus communication with Cerbo?
I have not performed that test, although I imagine that with Stuart's correction, the system reconnects automatically.
There is a Modbus cable communication with the Victron system. This communication was not affected before Stuart's correction. The only thing that was lost, by not having WIFI communication, was the MQTT transmission. In any case, it has now been almost 48 hours without loss of connection, or if it was lost occasionally, the system restored it automatically. The problem that caused this ticket was that the Wifi connection was lost and was not re-established automatically.
Sorry chapulino, I don't understand that. Do you mean that you have a Modbus cable connection from the diyBMS to the Victron Cerbo? The diyBMS doesn't have Modbus, only CAN bus to the Victron battery CAN bus port. I'm using the diyBMS CAN bus to the Victron battery CAN bus, and there is also no communication break when WiFi is down. I use Modbus with the Cerbo GX to read and write values from ioBroker, but not with the diyBMS. I also use diyBMS MQTT with an ioBroker adapter to read values and visualize them in ioBroker VIS. diyBMS MQTT runs over WiFi.
Sorry for the confusion, I was actually referring to CAN bus.
I confused the name of the protocol. I also use Modbus to connect my Solax inverters to the CERBO.
@chapulino Hi chapulino, have you been able to try out what happens when you turn your wifi off and on again? Or do you only switch it on again after 5 minutes? That would be a great help for me - thanks :-)
If I turn the controller off and on again, and there is a sufficient Wi-Fi signal, it connects and does not present any problem. The problem described was solved with the update. Now I'm waiting for an ESP32 with an antenna to improve the WiFi signal
@chapulino Thank you very much! Did you wait a few minutes? After a few minutes my ESP32 won't connect to wifi again :-( I also use ioBroker as the MQTT server. Maybe my problem is the MQTT connection.
When you connect to Wi-Fi, what dBm value (Wifi signal strength) do you have?
Do you have the latest version installed? https://github.com/stuartpittaway/diyBMSv4ESP32/releases/tag/Tag-2023-09-12-10-49
Wifi has -40dBm. Yes, same firmware. I made a little test and deactivated MQTT in both diyBMS and ioBroker. That's it! MQTT is the culprit! Without MQTT, the diyBMS is back on the wifi within 30 seconds of it becoming available.
It was reported at one point that having both integrations, MQTT and Influx, enabled at the same time could cause problems. I suppose that is not the case here, but it is another possible cause to investigate.
Another user on Patreon has reported issues with wifi not restoring correctly - this also seems to be tied into having MQTT enabled.
Needs further investigation, as I think the wifi and MQTT libraries are not being shut down correctly when WIFI is lost.
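One way to read the suspicion above: if a new MQTT client is allocated on each retry while the previous one is never released, every Wi-Fi drop costs memory. The sketch below is a plain C++ model of the opposite policy (release the client as soon as Wi-Fi is lost, keep at most one alive). It is not the diyBMS code and does not use the ESP-IDF API; `FakeMqttClient`, `ConnectionManager` and `liveClients` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <memory>

// Number of client objects currently alive; lets a leak show up as a count.
static int liveClients = 0;

// Hypothetical stand-in for an MQTT client handle.
struct FakeMqttClient {
    FakeMqttClient() { ++liveClients; }
    ~FakeMqttClient() { --liveClients; }
    FakeMqttClient(const FakeMqttClient&) = delete;
    FakeMqttClient& operator=(const FakeMqttClient&) = delete;
};

class ConnectionManager {
public:
    // Wi-Fi came up: only record the fact, the client is created lazily.
    void onWifiConnected() { wifiUp_ = true; }

    // Wi-Fi dropped: release the client immediately so its memory is freed.
    void onWifiLost() {
        wifiUp_ = false;
        client_.reset();
    }

    // Periodic retry hook. Returns true if a client exists afterwards.
    bool tryConnectMqtt() {
        if (!wifiUp_) {
            return false;            // never allocate while Wi-Fi is down
        }
        if (!client_) {
            client_ = std::make_unique<FakeMqttClient>();
        }
        assert(liveClients <= 1);    // invariant: at most one client alive
        return true;
    }

private:
    bool wifiUp_ = false;
    std::unique_ptr<FakeMqttClient> client_;
};
```

Because the client is owned by a `std::unique_ptr`, repeated lost/connected cycles cannot accumulate clients in this model.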
I've made a few changes to the wifi/mqtt code to help improve the disconnect/retry scenarios.
The code is in the branch "wifi_mqtt_fixes" and a pre-compiled version is available from this GITHUB action...
https://github.com/stuartpittaway/diyBMSv4ESP32/actions/runs/6695228114
(look at the artifacts at the bottom of the page)
Hello - I just performed a test with Release-2023-10-30-10-43. After about 15 hours I get the following:
E (6666795) MQTT_CLIENT: Error transport connect
E (6666799) diybms-mqtt: MQTT_EVENT_ERROR
E (6666803) diybms-mqtt: Last err no string (Success)
I (6666808) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6666806) diybms-mqtt: MQTT enabled. WIFI not connected
I (6667900) diybms-mqtt: Connect MQTT
E (6667900) TRANSPORT_WS: esp_transport_ws_init(621): Memory exhausted
E (6667901) MQTT_CLIENT: create_client_data(765): Memory exhausted
E (6667904) diybms-mqtt: mqtt_client returned NULL
E (6668264) diybms-mqtt: MQTT enabled. WIFI not connected
E (6670313) diybms: Connect to the Wifi AP failed
E (6670330) esp-tls: [sock=51] connect() error: Host is unreachable
E (6670331) TRANSPORT_BASE: Failed to open a new connection: 32772
After a while it just stays in a loop:
E (6670331) MQTT_CLIENT: Error transport connect
E (6670336) diybms-mqtt: MQTT_EVENT_ERROR
E (6670339) diybms-mqtt: Last err no string (Success)
I (6670344) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6670368) esp-tls: [sock=51] connect() error: Host is unreachable
E (6670369) TRANSPORT_BASE: Failed to open a new connection: 32772
E (6670369) MQTT_CLIENT: Error transport connect
E (6670374) diybms-mqtt: MQTT_EVENT_ERROR
E (6670377) diybms-mqtt: Last err no string (Success)
I (6670382) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6670397) esp-tls: [sock=51] connect() error: Host is unreachable
E (6670397) TRANSPORT_BASE: Failed to open a new connection: 32772
E (6670398) MQTT_CLIENT: Error transport connect
E (6670403) diybms-mqtt: MQTT_EVENT_ERROR
E (6670406) diybms-mqtt: Last err no string (Success)
I (6670411) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6670426) esp-tls: [sock=51] connect() error: Host is unreachable
E (6670426) TRANSPORT_BASE: Failed to open a new connection: 32772
E (6670427) MQTT_CLIENT: Error transport connect
E (6670432) diybms-mqtt: MQTT_EVENT_ERROR
E (6670435) diybms-mqtt: Last err no string (Success)
I (6670440) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6670455) esp-tls: [sock=51] connect() error: Host is unreachable
E (6670455) TRANSPORT_BASE: Failed to open a new connection: 32772
E (6670456) MQTT_CLIENT: Error transport connect
E (6670461) diybms-mqtt: MQTT_EVENT_ERROR
E (6670464) diybms-mqtt: Last err no string (Success)
I (6670469) diybms-mqtt: MQTT_EVENT_DISCONNECTED
E (6671264) diybms-mqtt: MQTT enabled. WIFI not connected
I (6671800) diybms: Task 2
E (6671823) diybms-mqtt: MQTT enabled. WIFI not connected
E (6674264) diybms-mqtt: MQTT enabled. WIFI not connected
I (6674583) diybms: Time now: Mon Nov 6 06:24:01 2023
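For anyone wanting to sift through a long capture like the one above, here is a small sketch of a parser for lines of the shape `<level> (<number>) <tag>: <message>`, which is the layout seen in this log. The `LogEntry` struct and `parseLogLine` function are hypothetical helpers, not part of diyBMS or ESP-IDF, and the meaning of the number in parentheses is left uninterpreted.

```cpp
#include <cassert>
#include <string>

// One parsed log line, e.g. "E (6667900) TRANSPORT_WS: ...".
struct LogEntry {
    char level = '?';                 // 'E', 'W', 'I', 'D' or 'V'
    unsigned long long timestamp = 0; // number inside the parentheses
    std::string tag;                  // text before the first ": "
    std::string message;              // everything after the first ": "
};

// Returns true and fills `out` if `line` matches the expected layout.
bool parseLogLine(const std::string& line, LogEntry& out) {
    const std::string levels = "EWIDV";
    if (line.size() < 4 || levels.find(line[0]) == std::string::npos ||
        line[1] != ' ' || line[2] != '(') {
        return false;
    }
    const std::size_t close = line.find(')', 3);
    if (close == std::string::npos || close == 3) {
        return false;                 // no digits between the parentheses
    }
    unsigned long long ts = 0;
    for (std::size_t i = 3; i < close; ++i) {
        if (line[i] < '0' || line[i] > '9') return false;
        ts = ts * 10 + static_cast<unsigned>(line[i] - '0');
    }
    if (close + 1 >= line.size() || line[close + 1] != ' ') {
        return false;
    }
    const std::size_t tagStart = close + 2;
    const std::size_t sep = line.find(": ", tagStart);
    if (sep == std::string::npos || sep == tagStart) {
        return false;                 // missing or empty tag
    }
    assert(sep > tagStart);
    out.level = line[0];
    out.timestamp = ts;
    out.tag = line.substr(tagStart, sep - tagStart);
    out.message = line.substr(sep + 2);
    return true;
}
```

With entries parsed this way it becomes straightforward to, for instance, filter for messages containing "Memory exhausted" or count errors per tag.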
Hi @GregoryFi, how are you testing this - disabling WIFI on the access point?
It looks like the controller is staying "up" rather than rebooting/running into issues. Did the WIFI reconnect successfully after the test?
I can see the MQTT has run into problems with memory allocation.
Hi Stuart - I connected a laptop to the controller's serial port over USB, logged the output with PuTTY, and left it running until I saw I could no longer access the controller through its local IP. Wifi was never disabled, but it was also never restored until I rebooted the controller (by cutting the power).
Ok @GregoryFi - I can see errors similar to "Connect to the Wifi AP failed" in your log file. Can you confirm the RSSI (wifi signal strength) at the ESP32? This is shown in the lower right of the TFT display, if you have that fitted.
Eh, not really ... I removed the TFT as it caused issues in the past. But I can recall it was around -50 to -54 dBm when I had it mounted. I did have similar issues with the previous controller type (using ESP8266); it was the main reason I wanted to upgrade my controller. Sometimes the ESP32 in my OpenEVSE disconnects, but it always reconnects. The main difference in wifi configuration is that I have assigned a fixed IP to the controller (as I thought DNS was causing the issues), whereas the ESP32 in the OpenEVSE gets a new IP dynamically almost every time it resets ...
Ok, I don't really see the same issues you are experiencing, my controller will stay connected to the WIFI for weeks at a time.
It looks like a poor quality WIFI connection, hence all the errors in your setup. The comments above in this issue relate to people with low signal strength (-88dBm) or who are deliberately switching off WIFI access points on a schedule/overnight.
Could this be an underlying issue with your wifi router/access point?
Probably, but I have difficulty understanding the difference between poor quality and low signal strength. The main issue is that it doesn't reconnect automatically. The issue I had with the TFT installed was that it would reboot all the time ... in this case it seems an automatic reboot when the wifi connection is lost would be ideal: would there be any drawback to coding this?
The TFT issue should have been resolved if you want to try that again - there appears to be some difference between various revisions of the ESP32 hardware/chips which caused a problem.
It wouldn't be a good idea if the controller rebooted on WIFI being lost - if WIFI goes down for a long period of time, the controller would be in an endless reboot loop - and all the time, your battery is at risk of damage/over charge.
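An alternative that avoids the endless-reboot risk described above is to keep the controller running and retry the Wi-Fi connection with a growing, capped delay. The following is only a sketch of that idea; `retryDelayMs` and its default values are assumptions made for illustration, not the diyBMS implementation.

```cpp
#include <cassert>
#include <cstdint>

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt,
// capped at maxMs so retries never stop entirely.
std::uint32_t retryDelayMs(unsigned attempt,
                           std::uint32_t baseMs = 1000,
                           std::uint32_t maxMs = 60000) {
    assert(baseMs > 0 && baseMs <= maxMs);
    std::uint64_t delay = baseMs;
    // Stop doubling once the cap is reached, which also avoids overflow.
    for (unsigned i = 0; i < attempt && delay < maxMs; ++i) {
        delay *= 2;
    }
    return static_cast<std::uint32_t>(delay < maxMs ? delay : maxMs);
}
```

With the defaults, the delays run 1 s, 2 s, 4 s and so on up to a ceiling of 60 s, while battery protection keeps running because the controller never restarts.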
The quality of the wifi signal, rather than just signal strength, is very important. The two are often linked, but it seems like a number of data packets get lost when communicating with MQTT, or for DNS lookups. Do you have any firewalls or other blocking devices on the network?