AI-on-the-edge-device icon indicating copy to clipboard operation
AI-on-the-edge-device copied to clipboard

Wifi Connection reduced with 15.7.0

Open JensHoRi opened this issue 1 year ago • 13 comments

The Problem

I just updated from 15.4.0 to 15.7.0

While connection 1ith 15.4.0 was o.k. (ESP is for electricity in the measurement cabinet), now its Unreliable (-91dBm).

Every 1 to 3 rounds the esp is rebooting now, because not able to send MQTT. Wouldn't it make more sense just to reconnect wiFi instead of full reboot?

Had there been a WiFi driver change between this versions?

Version

15.7.0

Logfile

[0d00h00m09s] 2024-05-02T22:35:00 <INF> [WIFI] Assigned IP: 10.10.10.164
[0d00h00m11s] 2024-05-02T22:34:57 <INF> [SNTP] Time is synced with NTP Server 10.10.10.201: 2024-05-02 22:34:57
[0d00h00m12s] 2024-05-02T22:34:59 <INF> [MAIN] Device info: CPU cores: 2, Chip revision: 100
[0d00h00m12s] 2024-05-02T22:34:59 <INF> [MAIN] SD card info: Name: APPSD, Capacity: 3840MB, Free: 3818MB
[0d00h00m14s] 2024-05-02T22:35:01 <INF> [MAIN] Initialization completed successfully
[0d00h00m16s] 2024-05-02T22:35:03 <INF> [LOGFILE] Set log level to WARNING
[0d00h01m01s] 2024-05-02T22:35:47 <ERR> [POSTPROC] main: Raw: 001011.98, Value: , Status: Neg. Rate - Read: - Raw: 001011.98 - Pre: 1012.08
[0d00h04m52s] 2024-05-02T22:39:38 <WRN> [WIFI] Disconnected (7)
[0d00h04m52s] 2024-05-02T22:39:38 <WRN> [MQTT IF] Disconnected, trying to reconnect
[0d00h04m55s] 2024-05-02T22:39:42 <WRN> [WIFI] Disconnected (39)
[0d00h04m58s] 2024-05-02T22:39:44 <WRN> [WIFI] Disconnected (205)
[0d00h05m01s] 2024-05-02T22:39:48 <ERR> [POSTPROC] main: Raw: 001011.99, Value: , Status: Neg. Rate - Read: - Raw: 001011.99 - Pre: 1012.08
[0d00h05m01s] 2024-05-02T22:39:48 <WRN> [MQTT SERVER] Unable to send System Topics, we are not connected to the MQTT broker!
[0d00h05m01s] 2024-05-02T22:39:48 <WRN> [MQTT] One or more MQTT topics failed to be published!
[0d00h00m00s] 2024-05-02T20:41:18 <INF> [MAIN] =================================================
[0d00h00m00s] 2024-05-02T20:41:18 <INF> [MAIN] ==================== Start ======================
[0d00h00m00s] 2024-05-02T20:41:18 <INF> [MAIN] =================================================
[0d00h00m00s] 2024-05-02T20:41:18 <INF> [MAIN] PSRAM size: 8388608 byte (8MB / 64MBit)
[0d00h00m00s] 2024-05-02T20:41:18 <INF> [MAIN] Total heap: 4380067 byte
[0d00h00m04s] 2024-05-02T20:41:22 <INF> [MAIN] Camera info: PID: 0x26, VER: 0x42, MIDL: 0x7f, MIDH: 0xa2
[0d00h00m04s] 2024-05-02T20:41:22 <INF> [SDCARD] Basic R/W check started...
[0d00h00m04s] 2024-05-02T20:41:23 <INF> [SDCARD] Basic R/W check successful
[0d00h00m05s] 2024-05-02T20:41:23 <INF> [SNTP] TimeServer: 10.10.10.201
[0d00h00m05s] 2024-05-02T20:41:23 <INF> [SNTP] Configuring NTP Client...
[0d00h00m05s] 2024-05-02T22:41:23 <INF> [SNTP] Time zone set to CET-1CEST,M3.5.0,M10.5.0/3

Expected Behavior

No response

Screenshots

No response

Additional Context

No response

JensHoRi avatar May 02 '24 20:05 JensHoRi

As far as I know, nothing has been changed on our part regarding the WLAN implementation. However, I cannot say whether the update from [email protected] to [email protected] changed the WiFi implementation.

the current rolling with [email protected] for testing: AI-on-the-edge-device__update__rolling_(4049d75).zip

SybexX avatar May 03 '24 10:05 SybexX

Thank you very much. So a special version to test: I only need to install the firmware via OTA to test, correct?

JensHoRi avatar May 03 '24 21:05 JensHoRi

@JensHoRi yes this is for OTA update

SybexX avatar May 04 '24 04:05 SybexX

O.k., lets see. I updated and was needed to fully configure everything again... :-( By the way I saw a new parameter called rssi Treshold maybe responible for all the reboots of the device. Turned it off now. First minutes doesn't show a positive effect. Very interesting: Unbenannt

Just installed new with special espressif. But until yet it looks like still same bad connection! Very confusing, because nothing else had been changed (WIFI Routers still same version, same position)

JensHoRi avatar May 04 '24 10:05 JensHoRi

What is the temperature of the ESP, has it increased compared to version 15.4.0? The ESP may reduce the transmission power because it gets too hot. With more functions and calculations, the temperature inevitably rises. My ESP is in a waterproof housing and through the new version the temperature is approx. 5°C higher.

SybexX avatar May 04 '24 11:05 SybexX

@JensHoRi yes this is for OTA update

@SybexX: It is very welcome when you try to help. But please be fair to the user and inform upfront that this NOT an original OTA update package because config.ini gets overwritten which results in loosing their complete configuration. So they have at least the chance to backup before using such update packages with identical names but different content.

Slider0007 avatar May 04 '24 12:05 Slider0007

Temperature is about 10 degrees lower than before. Around 80 Celsius

JensHoRi avatar May 04 '24 12:05 JensHoRi

80°C seems very hot to me, runs it at 240 MHz? Mine runs at 53°C and with version 15.4.0 it was 48°C.

SybexX avatar May 04 '24 13:05 SybexX

Well, I have got 4 devices bought in two orders. The first two are like yours around 45-55 degrees, while the two from the second order always hat been some hotter. The quality of esps is very different. :/

JensHoRi avatar May 04 '24 14:05 JensHoRi

Do the other two also have problems with the WiFi or only the ESPs that get so hot?

SybexX avatar May 04 '24 14:05 SybexX

The other are not in a steel cabinet, so in general better wifi. One of them is the hotter one, so that’s just because of the different manufacturer I believe.

JensHoRi avatar May 04 '24 16:05 JensHoRi

I went back to 15.4.0 again. rssi is still better as with 15.7.0, no idea why.

Think I should indead train my eyes and change contacts of ESP to external WiFi adress.

If I'm the only one recognizing wifi issues, you can close this issue.

JensHoRi avatar May 05 '24 15:05 JensHoRi

If you want you can test this version, for me the RSSI went down from -47 to -42. However, my Fritzbox has been going crazy lately, so I can't say with certainty whether the improvement is caused by my change or by the Fritzbox^^

AI-on-the-edge-device__update__rolling_(4049d75).zip

SybexX avatar May 09 '24 23:05 SybexX

Any news here? There soon will be a new release. @JensHoRi it would be great if you could try the new release resp., rolling. But be aware that there where no functional changes related to your issue. Also I think if there really would be a regression in 15.7.0, more people would have had complained.

caco3 avatar Aug 25 '24 17:08 caco3

I was on v15.6.0 before today. According to release notes it is after update of esp-idf to v5.1. Recently I have noticed that the device started to frequently disconnect.

...
[0d00h00m04s] 2024-08-24T01:36:54	<INF>	[MAIN] CPU frequency: 160 MHz
[0d00h00m05s] 2024-08-24T01:36:55	<INF>	[SDCARD] Folder/file presence check started...
[0d00h00m05s] 2024-08-24T01:36:55	<INF>	[SDCARD] Folder/file presence check successful
[0d00h00m05s] 2024-08-24T01:36:55	<INF>	[MAIN] Tag: 'v15.6.0', Release: v15.6.0 (Commit: 8d6f731+), Date/Time: 2024-02-09 13:01, Web UI: Release: v15.6.0 (Commit: 8d6f731+)
[0d00h00m05s] 2024-08-24T01:36:55	<WRN>	[MAIN] Reset reason: Exception/panic
[0d00h00m05s] 2024-08-24T01:36:55	<WRN>	[MAIN] Device was rebooted due to a software exception! Log level is set to DEBUG until the next reboot. Flow init is delayed by 5 minutes to check the logs or do an OTA update
[0d00h00m05s] 2024-08-24T01:36:55	<WRN>	[MAIN] Keep device running until crash occurs again and check logs after device is up again
...

I tried fw 15.7.0 release some moths ago but it had a bug with reference image so I could not proceed with the upgrade. Today I have installed SybexX's AI-on-the-edge-device__update__rolling_(4049d75).zip release. Reboots seems to have gone away. I will post RSSI levels later today to compare. CPU temperature is between 39 and 42 C for me.

yozik04 avatar Aug 26 '24 08:08 yozik04

@yozik04 Please use the official rolling to be able to give a good feedback! Custom builds are great to quickly test something but can not help to see if the official builds work well.

caco3 avatar Aug 26 '24 15:08 caco3

@caco3 this is the normal rolling + WiFi_Power_Save_Type: WIFI_PS_NONE / esp_wifi_set_ps(WIFI_PS_NONE);

SybexX avatar Aug 26 '24 16:08 SybexX

Ah, OK.

IMO we should make sure not to have too many different open feature branches. Better add it step by step to rolling so people can test it and find regressions.

Also note that I just made a branch with the lastest ESP IDF: https://github.com/jomjol/AI-on-the-edge-device/pull/3196 It seems to run well, at least according to the UART log. How ever none of the devices can be accessed through Wifi :(

caco3 avatar Aug 26 '24 16:08 caco3

Here are promised RSSI rates: image

Gaps in the metrics are when device was not available due to reconnects/reboots.

Starting from 11am I am on the build that @SybexX provided and it works fine. If you want me to try some other builds then I am open for testing.

yozik04 avatar Aug 26 '24 17:08 yozik04

That looks good!

@SybexX Are those config changes something we can select by default or will we need to add config parameters?

caco3 avatar Aug 26 '24 18:08 caco3

https://github.com/jomjol/AI-on-the-edge-device/pull/3186#issuecomment-2308923496 https://github.com/jomjol/AI-on-the-edge-device/issues/3122

SybexX avatar Aug 26 '24 18:08 SybexX

My comments on https://github.com/jomjol/AI-on-the-edge-device/pull/3186#issuecomment-2308923496. I have 3 access points in my house and I see that my ESP32 tried to roam between these.

yozik04 avatar Aug 26 '24 18:08 yozik04

=) And all 3 APs are Mikrotik ones.

yozik04 avatar Aug 26 '24 19:08 yozik04

@yozik04 How big is the RSSI difference between the 3 APs? @caco3 Maybe we should increase the threshold value at which the ESP32 changes the AP a little? currently it is "RSSI is better than actual RSSI + 5 --> Avoid switching to AP with roughly same RSSI"

SybexX avatar Aug 26 '24 19:08 SybexX

IMO I would say 5 dBm sound ok. 10 would already be a lot. It would be interesting to se the log and how often it switches.

caco3 avatar Aug 26 '24 20:08 caco3

@SybexX, @caco3 Here is one hour log from the main router which manages all 3 AP's via CAPSMAN. Signal level from two AP's is almost the same.

 08:02:15 wireless,info 10:32:1E:5C:D4:40@2G-Living room wAP ac2 roamed to 10:32:1E:5C:D4:40@2G-Hall cAP ac2, signal strength -55
 08:02:15 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:02:15 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:07:46 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 roamed to 10:32:1E:5C:D4:40@2G-Living room wAP ac2, signal strength -65
 08:07:46 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:07:46 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:13:18 wireless,info 10:32:1E:5C:D4:40@2G-Living room wAP ac2 reconnecting, signal strength -61
 08:13:25 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 connected, signal strength -64
 08:13:25 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:13:25 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:18:48 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 roamed to 10:32:1E:5C:D4:40@2G-Living room wAP ac2, signal strength 41
 08:18:48 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:18:48 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:24:20 wireless,info 10:32:1E:5C:D4:40@2G-Living room wAP ac2 reconnecting, signal strength -58
 08:24:28 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 connected, signal strength -63
 08:24:28 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:24:28 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:29:50 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 roamed to 10:32:1E:5C:D4:40@2G-Living room wAP ac2, signal strength -54
 08:29:50 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:29:50 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:35:36 wireless,info 10:32:1E:5C:D4:40@2G-Living room wAP ac2 reconnecting, signal strength -58
 08:35:43 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 connected, signal strength -64
 08:35:43 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:35:43 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:41:07 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 reconnecting, signal strength -55
 08:41:15 wireless,info 10:32:1E:5C:D4:40@2G-Office hAP ax^22 connected, signal strength -90
 08:41:15 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:41:15 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:46:37 wireless,info 10:32:1E:5C:D4:40@2G-Office hAP ax^22 roamed to 10:32:1E:5C:D4:40@2G-Living room wAP ac2, signal strength -59
 08:46:37 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:46:37 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:52:09 wireless,info 10:32:1E:5C:D4:40@2G-Living room wAP ac2 reconnecting, signal strength -61
 08:52:17 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 connected, signal strength -64
 08:52:17 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:52:17 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:58:06 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 reconnecting, signal strength -63
 08:58:14 wireless,info 10:32:1E:5C:D4:40@2G-Office hAP ax^22 connected, signal strength -87
 08:58:14 dhcp,info dhcp-local deassigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter
 08:58:14 dhcp,info dhcp-local assigned 192.168.1.17 for 10:32:1E:5C:D4:40 watermeter

yozik04 avatar Aug 26 '24 21:08 yozik04

Thanks for the log.

Did you adjust some parts of the log? There are some odd lines:

  • 08:18:48 wireless,info 10:32:1E:5C:D4:40@2G-Hall cAP ac2 roamed to 10:32:1E:5C:D4:40@2G-Living room wAP ac2, signal strength 41 -> why is 41 positive?
  • 08:41:15 wireless,info 10:32:1E:5C:D4:40@2G-Office hAP ax^22 connected, signal strength -90 -> That AP seems to be farther away and as obviously a weaker signal. So wo does the ESP connect to that one?

Can you also provide the log from the ESP?

caco3 avatar Aug 26 '24 21:08 caco3

I have replaced mac address in full, but not any signal levels. That is exactly what is in log.

Why it has connected to Office AP? I have no idea. It is far away from it. I have a feeling that it connects to the first one from what it hears a broadcast after a boot. Maybe Office one was the first.

yozik04 avatar Aug 27 '24 04:08 yozik04

I have not implemented it myself, but indeed it might be posssible it connects initially to the first AP it sees and only later roames to the best one. To understand that better, the log is needed. if you have the logging set to at least INFO, you could grab the logfile of that day from the SD card!

caco3 avatar Aug 27 '24 05:08 caco3

@caco3 Regarding signal strength with + sign. It is the only occasion on that day. So let's just ignore it.

Here is log from ESP. It did a lot of reboots so it has DEBUG levels there as well. I am not 100% sure if I took the same hour from the log to compare with the log from the router. esp32.log

yozik04 avatar Aug 27 '24 06:08 yozik04