Shelly Door/Window 2 (battery operated, gen1) goes unavailable and stops working until integration reload
The problem
My Shelly Door/Window 2 device stops reporting updates and all its sensors go unavailable. The device seems to connect to the network correctly: I also have a UniFi network integrated in HA, and it reports the device (by MAC address) as home and then away on each update (that's how battery-operated Shelly devices work: they connect only to send the update, then go offline).
I can get the device working again for a while if I reload the integration while the device is awake (by pushing the hardware button, which wakes the device for a couple of minutes). It then works for some hours, until it shows "unavailable" again.
The device has a static IP address and CoIoT correctly configured. In fact, it had been working for quite some time. Firmware is the latest available, as I updated it when I started diagnosing this problem.
I can certainly provide diagnostic logs for the integration, but please let me know when I should enable them and for how long. As the problem manifests only after several hours, I'm not sure I can keep diagnostics running for that long. I have several other Shelly devices, so the file could be large.
What version of Home Assistant Core has the issue?
core-2024.5.1
What was the last working version of Home Assistant Core?
Not sure. 2024.4.x worked for sure, but I'm not sure about 2024.5.0.
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Shelly
Link to integration documentation on our website
https://www.home-assistant.io/integrations/shelly/
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response
Hey there @balloob, @bieniu, @thecode, @chemelli74, @bdraco, mind taking a look at this issue as it has been labeled with an integration (shelly) you are listed as a code owner for? Thanks!
Code owner commands
Code owners of shelly can trigger bot actions by commenting:
@home-assistant close: Closes the issue.
@home-assistant rename Awesome new title: Renames the issue.
@home-assistant reopen: Reopen the issue.
@home-assistant unassign shelly: Removes the current integration label and assignees on the issue, add the integration domain after the command.
@home-assistant add-label needs-more-information: Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
@home-assistant remove-label needs-more-information: Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.
(message by CodeOwnersMention)
shelly documentation shelly source (message by IssueLinks)
I'm also having issues with 5.0 and 5.1. For me it's the Shelly TRV and Shelly Motion that become unavailable, and only a restart of the integration fixes that. Meanwhile they are available in the Shelly App and via their local IP.
Please attach the diagnostics file.
And please enable debug logging for the Shelly integration, restart HA, wait 15 minutes, close/open the door, disable debug logging and attach the log file here.
Done. But in 15 minutes it's not failing. It takes hours until the device becomes "unavailable".
config_entry-shelly-0afccd1550e7c9a04b81f0294aae0701.json home-assistant_shelly_2024-05-07T05-44-22.136Z.log
The device appears to be properly configured and data from the device is reaching the HA server. The integration marks entities as unavailable if data from the device does not reach the HA server for 12 hours * 1.2 = 14.4 hours. We need to determine what happens after this time so you have to catch this moment in the log.
What if the door is not opened (the state doesn't change) in 14.4 hours? I'm guessing these Shelly devices do not perform any keep-alive connection, right?
The device should send the status to the HA server every 12 hours. Regardless of whether the door was opened or not.
The integration increases this time by 20% and marks the device as unavailable once that longer period has passed without an update.
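The timeout rule described above can be written out as a small calculation. This is only a sketch of the arithmetic quoted in this thread (reporting period plus 20%); the function and constant names are hypothetical and are not taken from the integration's source.

```python
# Hypothetical names for illustration; only the 1.2 multiplier and the
# 12-hour reporting period come from the comments in this thread.
UPDATE_PERIOD_MULTIPLIER = 1.2


def unavailable_after_seconds(sleep_period_seconds: float) -> float:
    """Seconds without an update after which entities would be marked unavailable."""
    return sleep_period_seconds * UPDATE_PERIOD_MULTIPLIER


# A device that reports every 12 hours:
timeout_hours = unavailable_after_seconds(12 * 3600) / 3600
print(f"{timeout_hours:.1f} hours")
```

Under these assumptions, a device that misses a single scheduled report would be flagged only after the extra 20% margin has also elapsed.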
I see. I will try to catch that with the diagnostic logs enabled.
Here are my logs. At 12:35 and 12:54, one TRV each changed to unavailable. The log is on my Google Drive because it's too big to upload here: https://drive.google.com/file/d/1YLYLKlzLmL2hWQchVjkctPynIhh5Y8oG/view?usp=drivesdk
config_entry-shelly-ad020fc9466acbc775edf007c5df012a.json config_entry-shelly-c6550611053d9afd178192480a8cb89b.json
At 12:35 and 12:54 has one TRV each changed to unavailable
Yes, the devices did not send updates on time or the updates did not reach the HA server and the devices were marked as unavailable. I don't see anything that would indicate an integration problem.
2024-05-07 12:35:32.555 ERROR (MainThread) [homeassistant.components.shelly] Error fetching shellytrv-60A423D07C9E data: Sleeping device did not update within 3600 seconds interval
2024-05-07 12:54:00.161 ERROR (MainThread) [homeassistant.components.shelly] Error fetching shellytrv-60A423D3F87C data: Sleeping device did not update within 3600 seconds interval
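To find the moment a device was marked unavailable in a large log file, lines like the two above can be extracted with a short script. This is a minimal sketch: the regular expression is modelled only on the two log lines quoted here, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Pattern modelled on the two log lines quoted above; other formats are not handled.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ERROR .*"
    r"Error fetching (?P<device>.+?) data: "
    r"Sleeping device did not update within (?P<interval>\d+) seconds interval"
)


def parse_sleep_timeouts(lines):
    """Return (timestamp, device, interval_seconds) for each matching log line."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
            hits.append((ts, match["device"], int(match["interval"])))
    return hits
```

Passing an open log file to `parse_sleep_timeouts` (iterating a file yields its lines) gives a list of timestamps that can be compared with the times the device was last seen on the network.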
But the strange thing is that it stays unavailable, and as soon as I reload the integration it's there again. Also, in UniFi and the Shelly App it's online all the time with no reconnect or so.
But the strange thing is that it stays unavailable, and as soon as I reload the integration it's there again.
This is how Home Assistant (the Shelly integration) works: when you reload, we restore the previous value and start counting the 14.4 hours again.
But the value changes like normal. It doesn't stay at one value.
For your device, the TRV, the sleep period is 10 minutes. So the integration will mark the device as unavailable if it does not receive data from the device for 10 * 1.2 = 12 minutes. This is why your chart looks normal. You must remember that CoIoT is UDP: packets can be lost and are not retransmitted. Often this problem is caused by network equipment. I myself once had an AP that started losing CoIoT packets after several hours of operation.
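One way to check whether UDP status packets actually arrive at a host is to log incoming datagrams with their arrival times. The sketch below is generic UDP code, not part of the integration; the helper names are hypothetical, and the default port 5683 (the standard CoAP port) is an assumption about where the device's CoIoT peer is pointed. Only one process can normally bind a given UDP port, so this would have to run while nothing else on the host is listening on it, or on a different port.

```python
import socket
import time


def open_udp_listener(host="0.0.0.0", port=5683):
    """Bind a UDP socket; port 5683 is an assumed default, adjust as needed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def log_datagrams(sock, max_packets, timeout):
    """Record (arrival_time, sender_ip, size) for up to max_packets datagrams.

    Stops early if no datagram arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    seen = []
    while len(seen) < max_packets:
        try:
            data, (sender_ip, _sender_port) = sock.recvfrom(2048)
        except socket.timeout:
            break
        seen.append((time.time(), sender_ip, len(data)))
    return seen
```

Comparing the gaps between recorded arrival times with the device's reporting period would show whether packets stop reaching the host or reach it but are not processed.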
Logger: homeassistant.components.shelly Source: helpers/update_coordinator.py:347 integration: Shelly (documentation, issues) First occurred: 16:08:57 (1 occurrences) Last logged: 16:08:57
Error fetching shellymotionsensor-60A4239A65B2 data: Sleeping device did not update within 3600 seconds interval
I have the same problem after updating from 2024.4.3 to 2024.5.2. @bieniu @thecode, it is easy to assume that this is a connection problem unrelated to the integration, but this problem never occurred even once before the update to 2024.5.2. Also, the device stays online in the router's device list, and the device's web interface is reachable by IP in a browser.
After a reboot, the device becomes unavailable, or its entities (motion, vibration, lux) get stuck in the last state/value they had at the time of the HA reboot. After reloading the device in the Shelly integration interface, the device starts reporting all values properly again.
I have no idea which changes between versions 2024.4.3 and 2024.5.2 cause this behavior, but it is surely related to the integration and not to external conditions.
config_entry-shelly-dfc56339e4e44286b9091eb9415f396f.json
config_entry-shelly-dfc56339e4e44286b9091eb9415f396f (1).json
The first diagnostics file is from when the device was shown as unavailable, and the second is from after I reloaded it in the Shelly integration.
https://github.com/home-assistant/core/assets/97987488/55a15a73-4d9f-45f7-9733-c3f0e0e0553c
In the video you can see that the device had already been online for over 7 hours, well before I rebooted HA and the device became unavailable. You can also see that the connection time updates every 3 seconds as it polls the connection, and, as stated before, the device interface is also available. (This was also already stated by @DerAutomatiker: "But strange thing is that it stays unavailable and as soon as I reload the integration it's there again. Also in UniFi and Shelly App it's online all the time with no reconnect or so")
You must remember that CoIoT is UDP, packets can be lost and are not retransmitted. Often this problem is caused by network equipment. I myself once had an AP, which after several hours of work lost the CoIoT packets.
I've changed the configuration of my UniFi APs to lock this device to its nearest AP. I saw that it sometimes connected to another AP. For now, it's been working fine for over 24 hours.
I am not able to do this. In my case the device is already connected to the closest router, which is also the main router, at a distance of 1.5 meters, so it seems very unlikely that the connection to the router is the problem. Moreover, before the update to 2024.5.2 I never had this error in the almost 2 years I have had these Shelly motion devices, and besides the HA update nothing changed in my setup: no firmware updates or anything else on the routers or the devices themselves. Everything is exactly the same as before the HA update.
Same issue here: 2 motion sensors and 1 Door/Window 2 are affected. The issue has occurred since I updated HA to 2024.5.x.
Error fetching Mechanical Room Motion Sensor data: Sleeping device did not update within 3600 seconds interval
If I reload the integration, it works temporarily but becomes unavailable again after some time, remaining stuck in the unavailable state.
@bieniu, @thecode, as you can see in the screenshots above, the motion sensor became unavailable in HA but is reachable and works through the web interface by IP address. This is just to back up my argument that it seems very unlikely to be a network error. In the logs I again got the "Sleeping device did not update within 3600 seconds interval" error, as stated in my previous comment.
I just want to point out again that before the 2024.5.2 update I never had this error message in my logs in the last 2 years of using the Shelly devices/integration.
@smarthomefamilyverrips I have looked at all the previous comments and didn't see any logs from you, yet you are 100% sure it is "the same problem" and integration related. While I am pretty sure it is integration related in your case and might be related to https://github.com/home-assistant/core/issues/116975, since I have seen the same on my setup, without logs I can't promise anything. The fact that it stopped working for someone in a specific release doesn't guarantee it is the same for you. You can either wait until https://github.com/home-assistant/core/issues/116975 is fixed and see if it fixes the problem for you, or provide logs so we can check.
@thecode, in the above comment I shared the error and the diagnostics files. I shared the pictures because previous comments stated that this was most likely a connection problem, so I shared some information about the device's connection status to show that this is unlikely in my case, in the hope that it would contribute somehow. I also saw the issue you referred to, but in my non-expert view it seemed to be a different issue and this one seemed more similar, hence I reacted here. But I hope you are right and that the fixes for that issue will also solve and close this one; I don't mind waiting for that. As for logs, I will only have time to supply them during the weekends, sorry for this. 🫣
Same problem with my TRVs. After some hours, they become unavailable. The web UI is still reachable and I can control the devices through it. CoIoT is set up properly and the firewall is configured to pass all traffic for the complete haos-host (TCP and UDP). All the other 9 Shelly devices work perfectly (mostly PM and Dimmer). If I reload the integration or restart Home Assistant, they are available and controllable again and work for some hours.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
This is still happening and is also reported in other issues, for example #119002.
We suspect that the problem may be caused by another integration (probably a custom one) blocking the event loop. The CoIoT packet with the status reaches the HA server but cannot be processed correctly. To check this, please enable HA's built-in debug mode, restart HA and attach the log file here after a few hours.
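To illustrate what blocking the event loop means in general, the following asyncio sketch schedules a callback (standing in for the handling of an incoming packet) and then makes a blocking call; the callback cannot run until the blocking call returns. It is a generic demonstration with hypothetical names, not code from Home Assistant or the Shelly integration.

```python
import asyncio
import time


async def measure_delay(block_seconds):
    """Return how long a scheduled callback waited while the loop was blocked."""
    loop = asyncio.get_running_loop()
    scheduled_at = loop.time()
    fired = loop.create_future()

    # Stand-in for "process an incoming packet": it should run almost immediately.
    loop.call_soon(lambda: fired.set_result(loop.time() - scheduled_at))

    # Blocking call inside a coroutine: the loop cannot run anything else meanwhile.
    time.sleep(block_seconds)

    # Only now does control return to the loop, so the callback finally runs.
    return await fired
```

Running `asyncio.run(measure_delay(0.5))` returns a delay of roughly the blocking time, whereas replacing `time.sleep` with `await asyncio.sleep` would let the callback run right away.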
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.
Not fixed yet