[BUG] HTTP down

Open eskey0 opened this issue 1 year ago • 241 comments

Faikin hardware: Faikin-S3-MINI-N4-R2: 91c1bc5 2024-03-31T10:59:15 S21, from Amazon

Daikin hardware: FTXP35N5V1B via s403

Describe the bug: The web interface goes down. I can still control the unit via MQTT and ping it, but there is no response over HTTP at all.

To Reproduce: No idea, it happened out of the blue. I waited to see if it would come back, but no dice.

Expected behavior: The web service keeps working. I searched for a way to reboot via MQTT to see if that fixes it, but found none.

Additional context: I have 3 of them, all configured and set up on the same day; only one of them failed.

eskey0 avatar Apr 08 '24 14:04 eskey0

Hmm, odd. We had this ages ago on older code with an app using the legacy URLs, but that was fixed long ago.

Try just power cycling or sending the restart command over MQTT and see if it comes back.

Try the web interface via IP address rather than URL/domain, in case it is an mDNS issue.

revk avatar Apr 08 '24 15:04 revk

Sorry, I didn't specify: yes, I do use the IP address directly to connect to the device. After the restart command the website is up again. I don't know if you want to dig into this further, or let it be for now.

eskey0 avatar Apr 08 '24 15:04 eskey0

OK, not sure. As I say, I have only seen this with some very specific (and now fixed) legacy IP polling. See if it happens again.

revk avatar Apr 08 '24 15:04 revk

Sure, I'll keep an eye on this and keep you updated. Thanks, sir, you're awesome!

eskey0 avatar Apr 08 '24 15:04 eskey0

Hello there again, just a heads up: my second device also got into the "HTTP fail" state, and I again fixed it with an MQTT restart. Now my 3rd device is in that state too.

EDIT: Just wanted to share the status. If no one else is experiencing this, maybe it's something in my setup.

eskey0 avatar Apr 15 '24 07:04 eskey0

Are you using the legacy URLs / polling them?

revk avatar Apr 15 '24 07:04 revk

I just navigate to http://ipaddress in the browser, it usually just works.

eskey0 avatar Apr 15 '24 07:04 eskey0

OK but no tools, HA plug-ins, or something, that may be accessing the legacy URLs for data?

revk avatar Apr 15 '24 08:04 revk

Not that I am aware of. Just HA through MQTT; nothing uses HTTP besides my browser, which I rarely use.

eskey0 avatar Apr 15 '24 08:04 eskey0

OK, as I know some HA plug-ins use the old URLs, but if you are using MQTT, that should be fine. Which leaves me rather puzzled at the issue, to be honest.

revk avatar Apr 15 '24 08:04 revk

It also looks timed: one failed, I rebooted it, about 3/5 days passed, then the next one failed, and so on. Now it is the 3rd one (of 3). I can reboot it via MQTT too and see whether the cycle starts again from the first one that failed.

To give you more insight: my network is more complex than average. The Faikins are on a restricted network, in the same segment as some cameras, with access only to HA through MQTT, web access from my computer, and your update server.

I do have some plug-ins in HA, but those were for the "official" modules, which were assigned different IP addresses, and I disconnected them from the units, so I don't think that could be the issue.

eskey0 avatar Apr 15 '24 08:04 eskey0

I have more information to share. It happened again, this time to 2 of the 3 devices I have. It happened just after I changed the Wi-Fi band on my AP; does that ring any bells? Again, after sending an MQTT reboot the website came back online.

I must add that I live in an apartment that is very noisy Wi-Fi wise.

eskey0 avatar Apr 17 '24 14:04 eskey0

This has just happened to me. The device is online: it responds to pings, nmap can see it but not analyse it, and it works over MQTT, but the web server times out. It is addressed by IP address. The web server is up again after an MQTT restart. Uptime was a few days.

Just before the web server stopped, I was looking at the page. It loaded OK the first time, then just gave the blue screen with no buttons. On a reload it loaded everything, then timed out.

Faikin-S3-MINI-N4-R2: b16bfc4 2024-08-12T14:02:04 S21

Would any more info help (Wireshark capture, status output, ...)?

antwin avatar Aug 18 '24 01:08 antwin

Just to check, are you using the legacy URLs? We think, somehow, there is a memory leak, possibly in the ESP-IDF.

revk avatar Aug 18 '24 06:08 revk

I'm not sure what you mean by legacy URLs. I'm using the IP address (192.168.0.150) directly.

antwin avatar Aug 18 '24 09:08 antwin

I.e. a monitoring app that talks HTTP to the Faikin to get/set data, the way the old Daikin Wi-Fi modules used to work.

revk avatar Aug 18 '24 09:08 revk

I'm using Firefox to read from http://192.168.0.150 (the Faikin) on one computer. The page appears to be refreshed at intervals. I have not disconnected the original Daikin wifi module, but that has never been used, and the Daikin app is not available here.

antwin avatar Aug 18 '24 09:08 antwin

OK, sounds like you are not using the legacy HTTP API then. The web page on the Faikin is not "refreshed"; it uses a web socket. It should have no problem working indefinitely. I'm puzzled if you think it is being refreshed.

When we have seen issues with the web server stopping, it has always been down to someone using some app (not the Daikin app, usually some Home Assistant plug-in that is not using MQTT) that polls the legacy HTTP APIs constantly. We think there is some memory leak issue from that, but we are not 100% sure.

If you are not doing that, it is the first case of a problem like this.

Can you check the settings / basic page occasionally and see if the memory figures on that page are going down over time?

revk avatar Aug 18 '24 09:08 revk

First off, thanks for the prompt replies - I'm very impressed! My terminology was off. The page is updated, which is why I assumed it was refreshed. I must get the hang of websockets some day. I'm not using HA. I intend to be using MQTT sometime. I'll check the memory figures on the settings page, but it's a cold wet night here (NZ) and I'm off to bed, so there will be a pause of a day or two.

antwin avatar Aug 18 '24 09:08 antwin

Have a good night. The fact this is not using legacy HTTP APIs is interesting, and so may give us clues.

revk avatar Aug 18 '24 09:08 revk

Here are some preliminary results from status/faikin - are these what you need to see?:

{"ts":"2024-08-20T05:25:26Z","id":"DC5475EF52FC","up":true,"uptime":3690,"mqtt-up":3686,"mem":119504,"spi":2090296}
{"ts":"2024-08-20T08:28:48Z","id":"DC5475EF52FC","up":true,"uptime":14692,"mqtt-up":14688,"mem":119324,"spi":2090196}
{"ts":"2024-08-20T22:40:31Z","id":"DC5475EF52FC","up":true,"uptime":65794,"mqtt-up":65790,"mem":119120,"spi":2090196}

antwin avatar Aug 20 '24 23:08 antwin

Ah, perfect, yes: mem and SPI, over time.

revk avatar Aug 21 '24 06:08 revk

No HTTP hangs for several days! More results:

{"ts":"2024-08-21T09:38:24Z","id":"DC5475EF52FC","up":true,"uptime":105267,"mqtt-up":21698,"mem":119324,"spi":2090196}
{"ts":"2024-08-21T23:28:38Z","id":"DC5475EF52FC","up":true,"uptime":155080,"mqtt-up":71511,"mem":118760,"spi":2090196}
{"ts":"2024-08-23T23:37:50Z","id":"DC5475EF52FC","up":true,"uptime":328430,"mqtt-up":244861,"mem":118676,"spi":2090040}
{"ts":"2024-08-25T05:06:19Z","id":"DC5475EF52FC","up":true,"uptime":434538,"mqtt-up":350969}

mem 113600+2090108 (for some reason, it's not now reporting "mem" in status.)

antwin avatar Aug 25 '24 05:08 antwin

MQTT is working fine. BUT although HTTP is working on one device I cannot connect on a second device. Current status:

{"ts":"2024-08-27T22:42:04Z","id":"DC5475EF52FC","up":true,"uptime":670681,"mqtt-up":587112,"mem":109792,"spi":2089848}

antwin avatar Aug 27 '24 23:08 antwin
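
The status payloads posted so far can be compared mechanically rather than by eye. The following is a minimal Python sketch, not part of the Faikin project, that takes the payloads quoted above (only those that include "mem") and reports how much the "mem" figure changed between the first and last sample. It assumes that "mem" is free internal memory in bytes; the function name `mem_trend` is made up for this illustration.

```python
import json
from datetime import datetime

# Status payloads copied verbatim from the comments above.
# The 2024-08-25 sample is left out because its JSON has no "mem" field.
SAMPLES = """
{"ts":"2024-08-20T05:25:26Z","id":"DC5475EF52FC","up":true,"uptime":3690,"mqtt-up":3686,"mem":119504,"spi":2090296}
{"ts":"2024-08-20T08:28:48Z","id":"DC5475EF52FC","up":true,"uptime":14692,"mqtt-up":14688,"mem":119324,"spi":2090196}
{"ts":"2024-08-20T22:40:31Z","id":"DC5475EF52FC","up":true,"uptime":65794,"mqtt-up":65790,"mem":119120,"spi":2090196}
{"ts":"2024-08-21T09:38:24Z","id":"DC5475EF52FC","up":true,"uptime":105267,"mqtt-up":21698,"mem":119324,"spi":2090196}
{"ts":"2024-08-21T23:28:38Z","id":"DC5475EF52FC","up":true,"uptime":155080,"mqtt-up":71511,"mem":118760,"spi":2090196}
{"ts":"2024-08-23T23:37:50Z","id":"DC5475EF52FC","up":true,"uptime":328430,"mqtt-up":244861,"mem":118676,"spi":2090040}
{"ts":"2024-08-27T22:42:04Z","id":"DC5475EF52FC","up":true,"uptime":670681,"mqtt-up":587112,"mem":109792,"spi":2089848}
""".strip().splitlines()


def parse_ts(ts):
    # Timestamps in the payloads are UTC with a trailing "Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def mem_trend(lines):
    """Return (bytes_changed, seconds_elapsed, bytes_per_day) between
    the earliest and latest sample that carries a "mem" field."""
    rows = [json.loads(line) for line in lines]
    rows = [r for r in rows if "mem" in r]
    rows.sort(key=lambda r: r["ts"])  # ISO 8601 strings sort chronologically
    first, last = rows[0], rows[-1]
    delta = last["mem"] - first["mem"]
    elapsed = (parse_ts(last["ts"]) - parse_ts(first["ts"])).total_seconds()
    return delta, elapsed, delta * 86400 / elapsed


if __name__ == "__main__":
    delta, elapsed, per_day = mem_trend(SAMPLES)
    print(f"mem changed by {delta} bytes over {elapsed / 86400:.1f} days "
          f"({per_day:.0f} bytes/day)")
```

On these samples the figure drops by a little under 10 KB across roughly a week, out of about 119 KB at the start, so the device still reports most of its memory as available at the point where HTTP stops responding.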

OK, that means it is not a memory leak. I'll have to look at number of TCP sockets or something.

Does it eventually recover, or does it need a restart?

revk avatar Aug 28 '24 08:08 revk

The working one worked for some hours. But it has also just stopped. It stopped with just the blue background page and 'settings....' at the bottom left, so no updating. So now no http connection on either, but pings and mqtt work fine.

antwin avatar Aug 28 '24 09:08 antwin

This sounds a lot like a TCP related issue. I'll have to have a play with the options.

revk avatar Aug 28 '24 09:08 revk
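
For anyone wanting to narrow down where the HTTP path stalls while pings and MQTT still work, a plain TCP probe can separate "port refuses connections" from "connection accepted but no reply". The following is a rough Python sketch using only the standard library. It is not a Faikin tool; the function name `probe_http` and the result labels are invented for this example, and port 80 is assumed because the web page is reached via plain http://.

```python
import socket


def probe_http(host, port=80, timeout=5.0):
    """Classify how a TCP/HTTP endpoint behaves.

    Returns one of: "refused", "connect-timeout", "unreachable",
    "accepted-but-silent", "reset", "ok", "closed-or-garbled".
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"            # nothing is listening on the port
    except socket.timeout:
        return "connect-timeout"    # SYN never answered
    except OSError:
        return "unreachable"        # routing or other network error

    with sock:
        sock.settimeout(timeout)
        try:
            request = b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
            sock.sendall(request)
            data = sock.recv(64)
        except socket.timeout:
            # Handshake completed, but the server never sent a response.
            return "accepted-but-silent"
        except OSError:
            return "reset"
        return "ok" if data.startswith(b"HTTP/") else "closed-or-garbled"
```

Calling `probe_http("192.168.0.150")` (the address mentioned earlier in this thread) while the web page is hung, and again after an MQTT restart, would show whether the device stops accepting connections altogether or accepts them and then never answers.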

I can't see any obvious issues, but I have changed a TCP mailbox size in the latest beta anyway.

revk avatar Sep 06 '24 09:09 revk

All, same here. The Faikin ESP web interface stops responding. This has now happened for the 3rd time, with approximately 2 months between failures. [ Faikin-S3-MINI-N4-R2: 4945ab0 2024-02-28T15:54:01 X50A ] I do have the HA integration turned on, but HA talks to the device using MQTT (at least that is what it shows on the HA dashboard). Thanks for pointing to the MQTT 'reset' (until now I had to climb up to the ceiling-mounted Daikin and unplug the ESP32 module for 30 seconds to make it work again :-) )

ataelim avatar Sep 16 '24 14:09 ataelim

I have the same issue. In my case the HTTP interface works for about 1h and then is no longer reachable. The behaviour is reproducible. MQTT is connected to Home Assistant and stays responsive all the time. After a reset via MQTT, HTTP works again for about 1h. Faikin-S3-MINI-N4-R2 connected to S21 of ATXM35R. It says the SW is up to date.

Update: If I wait for 1-2 minutes, the buttons sometimes show up on the website. Before the buttons appear it is just the blue background. But most often I just get a timeout error.

Obergangster123 avatar Nov 16 '24 09:11 Obergangster123