balboa_GL_ML_spa_control

Failed to Connect to WiFi, Rebooting

Open KSchmeeds opened this issue 1 year ago • 28 comments

I am attempting to set up this project on one of these boards, but the device appears to get stuck in a boot loop while trying to connect to WiFi. The device logs show "connecting to my SSIDNAME", and then after a few seconds it says failed to connect, rebooting.

I tried a few different ESP32 board options in the platformio.ini, but received the same results. I did see an arp entry for the device briefly show up on my network, and then it dropped.

This device works fine on my WiFi if using ESPHome.

I'm willing to purchase a different model board if you have a recommendation.

KSchmeeds avatar Apr 11 '24 04:04 KSchmeeds

Just sounds like either your settings are wrong or your WiFi signal is too weak. Can you try with the ESP32 close to your router?

netmindz avatar Apr 11 '24 16:04 netmindz

I'm assuming the log example you gave is hiding your network name for privacy reasons and you actually see the right name?

netmindz avatar Apr 11 '24 16:04 netmindz

Signal strength isn't an issue, as I have adequate 2.4 GHz coverage in that location, and there are several other ESP devices nearby connected without any issues. It also works fine if I just use ESPHome on the device.

Yes, I do see my correct SSID, redacted for privacy. I do see it connect to the network successfully from my authentication logs on my network controller, but the device just reboots.

Any recommendation for the platformio env I should try for this board I'm using?

KSchmeeds avatar Apr 11 '24 18:04 KSchmeeds

I did some more testing and it appears to be rebooting after connecting. Usually it just says failed to connect and then invokes a reboot; other times it gets an IP and then crashes shortly after.

KSchmeeds avatar Apr 12 '24 02:04 KSchmeeds

If you can add this to the env and then send the log of the crash, I can help diagnose:

monitor_filters = esp32_exception_decoder

netmindz avatar Apr 12 '24 08:04 netmindz

Can you also test whether the crash stops if you disconnect the rs485 connection? It's possible you have A+B the wrong way round, which gives corrupt data.

netmindz avatar Apr 12 '24 08:04 netmindz

I don't have the rs485 connected yet, I just have the esp32 by itself. Is that perhaps part of the problem?

I'll work on getting those debug logs collected.

KSchmeeds avatar Apr 12 '24 13:04 KSchmeeds

I set up the logging with the exception decoder (using your commit to the main branch), but this is the only info I am getting. I don't think the ESP32 is crashing, but rather something in the code thinks the WiFi isn't connecting, so it marks it as failed and triggers the reboot. I do see it briefly connect on my network controller, so the credentials are correct and it's able to pull an IP.

You can see a couple times in here I also used the reset button on the ESP32 to reboot it, as the terminal appeared to be hung after saying Wifi failed, reboot.

)
configsip: 0, SPIWP:0
Connecting to myredactedssidname....................
Wifi failed, reboot
ets Jun 8 2016 00:22:57
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname....................
Wifi failed, reboot
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname....................
Wifi failed, reboot
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname
ets Jun 8 2016 00:22:57
rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname....................
Wifi failed, reboot
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname....................
Wifi failed, reboot
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:13232
load:0x40080400,len:3028
entry 0x400805e4
Connecting to myredactedssidname.............

KSchmeeds avatar Apr 15 '24 01:04 KSchmeeds

I sometimes see connection errors for WiFi, but it normally works OK. If it's being tricky, then sometimes turning the power off for 30 seconds can help.

In order to test the comms to the spa, you can try enabling the option that makes the sensor switch to WiFi AP mode if WiFi fails, rather than rebooting. You could then view the status page to see if that side is working OK once we sort your WiFi issue.

Is it just regular WiFi you have, or are you using any repeaters or a mesh-style setup?

netmindz avatar Apr 16 '24 13:04 netmindz

I don't have the esp32 connected yet to the spa or the rs485 board, I'm just running it plugged in to my PC.

I'm wondering if the device is not giving enough time from when it associates to the network and receives an IP address before it marks it as failed.

I'm using Ubiquiti APs. I have about a dozen other ESP32 and ESP8266 devices running ESPHome that have no connection issues.

KSchmeeds avatar Apr 16 '24 13:04 KSchmeeds

You can try upping the number of retries in the code before it gives up. There is really nothing exciting going on in the code relating to WiFi, just the most basic connect code.

netmindz avatar Apr 18 '24 18:04 netmindz

WiFi.begin

Then 10 seconds to connect

netmindz avatar Apr 18 '24 18:04 netmindz
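
A minimal sketch of the "begin, then wait up to 10 seconds" pattern described above, written as plain C++ so it can run off-device. It is not the project's actual code. The function name `waitForWifi`, the injected callbacks, and the 20 polls at 500 ms each are assumptions chosen to total 10 seconds. On real hardware `is_connected` would typically wrap `WiFi.status() == WL_CONNECTED` and `wait` would wrap `delay(ms)`.

```cpp
#include <cassert>
#include <functional>

// Result of one connection attempt (hypothetical type for this sketch).
struct ConnectResult {
    bool connected;  // true if the link came up before the poll budget ran out
    int polls;       // number of status checks performed
};

// Poll `is_connected` up to `max_polls` times, waiting `poll_interval_ms`
// after each failed check. The caller decides what to do on failure
// (reboot, fall back to AP mode, and so on).
ConnectResult waitForWifi(const std::function<bool()>& is_connected,
                          const std::function<void(int)>& wait,
                          int max_polls,
                          int poll_interval_ms) {
    assert(max_polls > 0 && poll_interval_ms > 0);
    for (int poll = 1; poll <= max_polls; ++poll) {
        if (is_connected()) {
            return {true, poll};
        }
        wait(poll_interval_ms);  // on-device: delay(poll_interval_ms)
    }
    return {false, max_polls};
}
```

In this sketch, "upping the number of retries" corresponds to raising `max_polls`. With 20 polls at 500 ms the loop gives up after 10 seconds; 60 polls would allow 30 seconds for association and DHCP to complete.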

I figured out the issue. If I have the serial terminal open, or I connect to the serial terminal while the Wi-Fi is connected, it causes the device to enter a Wi-Fi connection reboot loop. As soon as I close any open serial connections and reboot the device, it connects just fine.

KSchmeeds avatar Apr 23 '24 01:04 KSchmeeds

How weird, I've not seen that before. Some of the newer chips like the S3 and C3 have quirky Serial support but the traditional ESP32 shouldn't care if you are connected or not.

Are you fully up and running now then?

netmindz avatar Apr 23 '24 18:04 netmindz

I'm not connected to the tub yet, as I'm still trying to figure out how the WebOTA feature works so I can update the firmware/code remotely without needing to connect locally to it via USB.

KSchmeeds avatar Apr 29 '24 00:04 KSchmeeds

You shouldn't really need to update the code too much, but just go to http://hottub-sensor:8080/update

netmindz avatar Apr 29 '24 11:04 netmindz

I see you have an ESPHome branch, what is the functionality status of that? Is the wiring diagram the same?

https://github.com/netmindz/balboa_GL_ML_spa_control/blob/ESPHome/component.yaml

I'd like to use ESPHome if it's in a usable state.

KSchmeeds avatar May 05 '24 20:05 KSchmeeds

Short answer is that it's not, see https://github.com/netmindz/balboa_GL_ML_spa_control/discussions/40

netmindz avatar May 07 '24 18:05 netmindz

I hooked everything up and installed it, but no data was being sent to or received from the tub. I knew MQTT was working as the uptime counter was updating, but everything was showing as unknown.

I did notice on my tub that the connector is flipped 180 degrees from your pinout diagram, so either I have a different pinout on my control board, or my rs485 adapter is dead. Is there an easy way to test the rs485 board or verify my pinout?

KSchmeeds avatar May 16 '24 21:05 KSchmeeds

Are you using a regular ESP32 and separate rs485 adapter? If so which one?

You do have panel select ("pin5") connected as well as the rs485 yeah?

Can you confirm the model number of your controller and send a photo please?

You can use a multimeter to confirm which pins are which on your board.

netmindz avatar May 17 '24 09:05 netmindz

Looks like I had a dead rs485 chip, and also the wiring diagram on the instruction page shows the wrong pin for one of the connections. I'll make a new diagram for you once I get everything up and running.

I can send and receive data from the tub now, however if I send some commands like changing the jet speed, or pressing the time button, it gets stuck doing the command indefinitely approximately every 1 second until I reboot or unplug the ESP32.

For example, if I press the "Time" button, it keeps cycling back and forth between time and temperature. The control panel is also unresponsive, as the ESP32 is holding it hostage.

Also, I noticed in Home Assistant that the decimal point for target temperature and tub temperature is in the wrong place. For example, 103F shows as 10.3F.

KSchmeeds avatar May 18 '24 00:05 KSchmeeds
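
One possible reading of the 10.3F symptom, offered purely as an assumption and not confirmed anywhere in this thread: if the display digits are always scaled as though they carried one decimal place (for example 385 for 38.5C), a whole-number Fahrenheit reading of 103 would come out as 10.3. The sketch below shows unit-aware scaling. The names `TempUnit` and `displayDigitsToTemperature`, and the idea that the reading arrives as an integer of display digits, are hypothetical.

```cpp
#include <cassert>

// Unit the tub is configured for (hypothetical enum for this sketch).
enum class TempUnit { Celsius, Fahrenheit };

// Convert raw display digits to a temperature value.
// Assumption: Celsius readings carry one implied decimal place (385 -> 38.5),
// while Fahrenheit readings are whole numbers (103 -> 103).
double displayDigitsToTemperature(int digits, TempUnit unit) {
    assert(digits >= 0);
    if (unit == TempUnit::Celsius) {
        return digits / 10.0;
    }
    return static_cast<double>(digits);
}
```

Whether the firmware really works this way would need checking against the source and against a tub switched between the two units.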

I can send and receive data from the tub now, however if I send some commands like changing the jet speed, or pressing the time button, it gets stuck doing the command indefinitely approximately every 1 second until I reboot or unplug the ESP32.

Interestingly, this is the same thing that was happening to me. Unfortunately our tub has been empty for a while now, but once we've cleaned and refilled it, I'll be interested to dig into this further.

davewatson91 avatar May 18 '24 01:05 davewatson91

Good to hear you have made some progress @KSchmeeds

You say that some commands get stuck in a loop, but do any of them work? How about toggling the light?

I haven't tested with a tub set to Fahrenheit, so it might be an idea to flip the DIP switch to change the mode and see if you have any better luck. Then we can debug any specific issues relating to Fahrenheit.

The loop is caused by the fact that, as I haven't seen 100% reliable sending of commands, I keep retrying until I see a change in the state data returned by the tub. It's possible that if we send a command that your controller doesn't understand or expect, like a time command when the config is set not to expect that, your controller would just ignore that command and so we get stuck in a loop.

netmindz avatar May 19 '24 09:05 netmindz
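
The retry behaviour described above (resend until the state reported by the tub changes) can be sketched with an attempt cap, so that a command the controller ignores cannot loop forever. This is a hypothetical illustration in plain C++, not the project's implementation or its eventual fix. The names `sendUntilStateChanges` and `SendOutcome`, the integer state, and the cap itself are assumptions.

```cpp
#include <cassert>
#include <functional>

// Outcome of a send attempt sequence (hypothetical type for this sketch).
enum class SendOutcome { Confirmed, GaveUp };

// Resend a command until the observed state differs from the state seen
// before the first send, or until `max_attempts` sends have been made.
// `read_state` stands in for whatever status value the tub reports back.
SendOutcome sendUntilStateChanges(const std::function<void()>& send_command,
                                  const std::function<int()>& read_state,
                                  int max_attempts) {
    assert(max_attempts > 0);
    const int before = read_state();
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        send_command();
        if (read_state() != before) {
            return SendOutcome::Confirmed;  // the tub reacted, stop sending
        }
    }
    return SendOutcome::GaveUp;  // command ignored, drop it instead of looping
}
```

With a cap in place, an ignored command costs a handful of resends and then releases the bus, which would also let the physical control panel respond again.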

The light command works perfectly. The other commands I tested work (time, jet1, jet2), except after issuing the command it keeps sending it endlessly.

KSchmeeds avatar May 19 '24 14:05 KSchmeeds

So a change to the jets just then keeps turning on and off?

Do you have single speed or dual speed pumps?

netmindz avatar May 20 '24 09:05 netmindz

Dual speed pumps. The time button is showing similar behavior of repeatedly pressing itself after pressing it once.

KSchmeeds avatar May 20 '24 13:05 KSchmeeds

If your existing panel doesn't have a time button then your controller won't be configured to expect that. So just don't press it.

I've only got single speed pumps, so I've not tested the send command retrying with dual speed.

netmindz avatar May 21 '24 13:05 netmindz

My tub is 2x dual speed pumps, and does have a time button on the control pad.

KSchmeeds avatar May 21 '24 22:05 KSchmeeds

I'm still interested in trying to get this to work, do you have any next steps for me to try about the endless command loop?

KSchmeeds avatar Jul 08 '24 23:07 KSchmeeds

It's best we start a new issue, as this is no longer about the WiFi failure:

https://github.com/netmindz/balboa_GL_ML_spa_control/issues/78

netmindz avatar Jul 09 '24 07:07 netmindz