Frequent Socket Errors (read ETIMEDOUT)
What happened?
Zigbee2MQTT regularly hits a socket error (read ETIMEDOUT) that stops the Docker container. It happens fairly consistently twice a day. For a while it occurred at exactly 9:04 (AM and PM), but the cadence shifted after I changed some settings and manually stopped and started the container a few times.
What did you expect to happen?
I'd expect the connection to the SLZB-06 to be stable when there is no network problem. Or, if there is a network problem, I'd expect it to reconnect instead of shutting down the container.
The adapter shows no obvious signs of malfunction either. The only thing its logs indicate is "New client [ip address] id: 1" followed by "Client disconnected, id: 0" (the ID numbers alternate).
How to reproduce it (minimal and precise)
I don't have clear repro steps beyond setting up the SLZB-06 and installing Zigbee2MQTT in a Docker container. Capturing the traffic between the two may prove difficult because of the long intervals between the timeout errors.
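Since a full packet capture is impractical over intervals this long, a lightweight alternative is to log reachability of the adapter with timestamps and compare them against the times of the ETIMEDOUT errors. Below is a minimal sketch, not something Zigbee2MQTT provides. The address 192.168.1.50 is hypothetical, and port 80 assumes the adapter's web interface is reachable there. It deliberately avoids the Zigbee socket port, on the assumption that a second client connecting there could disturb the Zigbee2MQTT session.

```python
import socket
import time
from datetime import datetime
from typing import Optional


def probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def log_probes(host: str, port: int, interval: float = 30.0,
               count: Optional[int] = None) -> None:
    """Print one timestamped line per probe; run forever when count is None."""
    done = 0
    while count is None or done < count:
        status = "ok" if probe(host, port) else "FAILED"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {status}")
        done += 1
        if count is None or done < count:
            time.sleep(interval)


# Example (hypothetical address; assumes the adapter's web UI listens on port 80):
# log_probes("192.168.1.50", 80)
```

If the FAILED lines cluster around the times of the socket errors, the problem is more likely on the network or adapter side than in Zigbee2MQTT.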
I can provide more information about my network setup if that's useful, but I don't think it's anything particularly out of the ordinary. DHCP leases are 4 hours, there is a single router with 1 subnet.
I have 8 Zigbee devices paired. 7 of them are currently offline. I've been able to reproduce the issue with no devices connected as well. I turned off OTA updates in case that was related, but it didn't change anything.
I'm open to any suggestions :)
Zigbee2MQTT version
2.3.0
Adapter firmware version
20240710
Adapter
SLZB-06
Setup
Docker on Synology DSM 7
Linux [DeviceName] 4.4.302+ #72806 SMP Thu Sep 5 13:44:44 CST 2024 x86_64 GNU/Linux synology_geminilake_224+
Debug log
The fact that z2m disconnects with this error cannot be fixed from z2m itself. It is caused by an instability of either the network or the SLZB-06.
"Or, if there is a network problem, I'd expect it to reconnect instead of shutting down the container."
For this you can use the watchdog feature
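The general idea behind a watchdog of this kind, restarting after a failure with progressively longer waits, can be sketched as follows. This is an illustration only and not Zigbee2MQTT's actual implementation. The function name run_with_backoff, the start callable, and the delay values are all assumptions made for the example.

```python
import time
from typing import Callable, Sequence


def run_with_backoff(start: Callable[[], None],
                     delays_s: Sequence[float],
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Call start(); on a socket-level error, wait and try again.

    Each retry waits for the next entry in delays_s. When the delays are
    used up, the last error is re-raised. Returns the number of retries
    that were needed before start() succeeded.
    """
    retries = 0
    while True:
        try:
            start()
            return retries
        except OSError as err:  # TimeoutError (ETIMEDOUT) is a subclass of OSError
            if retries >= len(delays_s):
                raise
            delay = delays_s[retries]
            print(f"start failed ({err!r}); retrying in {delay} s")
            sleep(delay)
            retries += 1


# Example with illustrative delays of 1, 5, and 15 minutes:
# run_with_backoff(connect_to_adapter, [60, 300, 900])  # connect_to_adapter is hypothetical
```

The growing delays keep a flapping connection from causing a tight restart loop while still recovering quickly from a brief outage.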
Understood. Oddly enough, after I posted this, the connection was stable for 3 days straight. Strange how it was failing cyclically for a while.
The watchdog feature looks like exactly what I need. I've enabled it on my container. Thank you!