Zigbee Home Automation stops - zigpy couldn't start application
The problem
ZHA would stop working ~30 minutes after startup with the following:
Logger: zigpy.application
Source: /usr/local/lib/python3.10/site-packages/zigpy/application.py:85
First occurred: 9:30:48 AM (1 occurrences)
Last logged: 9:30:48 AM
Couldn't start application
Not entirely sure what is happening, but it appears that it just stops and no further attempts are made after that.
What version of Home Assistant Core has the issue?
2022.7.x
What was the last working version of Home Assistant Core?
2022.6.x
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ZHA
Link to integration documentation on our website
No response
Diagnostics information
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
2022-07-16 09:30:41 INFO (MainThread) [bellows.zigbee.application] EZSP Radio manufacturer:
2022-07-16 09:30:41 INFO (MainThread) [bellows.zigbee.application] EZSP Radio board name:
2022-07-16 09:30:41 INFO (MainThread) [bellows.zigbee.application] EmberZNet version: 6.7.9.0 build 405
2022-07-16 09:30:42 INFO (MainThread) [homeassistant.components.websocket_api.http.connection] [139752900929920] Connection closed by client
2022-07-16 09:30:48 ERROR (MainThread) [zigpy.application] Couldn't start application
2022-07-16 09:30:48 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
Additional information
No response
I have gone back to 2022.6; otherwise my entire Zigbee setup is completely unusable, so 2022.7 is a no-go. I am not running a WiFi Zigbee coordinator but a LAN one directly connected to a Ubiquiti switch.
I can confirm this issue. EZSP via socket works in 2022.6.7 but no longer in 2022.7.x version because of the above error. Feel free to ask for more debugging if needed.
Same here, but in my case problems seem to have started from any version above 2022.7.0. Rolled back to 2022.7.0 from 2022.7.6 and ZHA is stable again.
Ok, I will double check later today. I may have skipped 2022.7.0
I'm thinking (but not sure) that 2022.7.1 was fine too, but I'm 100% sure 2022.7.0 is indeed fine as that's what I rolled back to.
A little follow up: over the last hour or two, I've been gradually upgrading HA Core starting with 2022.6.7 until 2022.7.6 and oddly enough, now everything is working as expected. No errors in the logs and the connection to the Zigbee coordinator seems to be stable.
So I'm not sure what to make of this...
Oh bummer, no luck. Still running 2022.7.0 and ZHA is stuck again.
2022-07-21 13:24:50 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2022-07-21 13:24:50 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
2022-07-21 13:25:31 ERROR (MainThread) [zigpy.application] Couldn't start application
2022-07-21 13:25:31 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
Just for reference, here's what I did and what seems to have fixed the issue:
- Initially I was on HA core 2022.7.6
- in the SSH terminal I downgraded to 2022.6.7 and next upgraded one version at a time:
ha core update --version 2022.6.7
ha core update --version 2022.7.0
ha core update --version 2022.7.1
ha core update --version 2022.7.2
ha core update --version 2022.7.3
ha core update --version 2022.7.4
ha core update --version 2022.7.5
ha core update --version 2022.7.6
On every step the Zigbee integration kept working.
We're now almost a day later and I haven't seen the problem reappear. I have a feeling I am just lucky and that bit of troublesome code was not (yet?) triggered for some reason.
Thanks Peter! After the ZHA lockup yesterday I restarted HA (via YAML dev-tools), and since then it's been running without any problems. So maybe the lockup on my side was incidental, I'll keep monitoring.
Status update: ZHA kept locking up multiple times a day last weekend, so I downgraded to 2022.6.7 last Sunday. I haven't had any lockups since.
I haven't experienced any issues, but as I said before, I think that's just luck. I don't have the possibility to dive into the code right now, but I have a feeling there is a bit of code that is triggered when an error condition happens that is not properly handling the error. Probably since 2022.7.0. But that's a lot of guesswork.
I have the feeling you are right. I'm using a Sonoff Zigbee Bridge loaded with Tasmota, connected to ZHA via WiFi, so when the connection between ZHA and Sonoff ZB Bridge is unstable ZHA has to reconnect. It seems as if from version 2022.7.0 and upwards the reconnect is having problems causing ZHA to crash.
I'm running into this as well but using a tube-zb-gw-efr32 running over ethernet so the connection is rock solid. From the device, I'm just seeing Home Assistant drop the connection with the same message. I have to restart home assistant to get it to work again (drops approx. every 30-min)
[D][streamserver:055]: Client 10.1.10.2 disconnected
Looks like we have the exact same issue.
I am having the same problem: a "Couldn't start application" error after some time, after which the ZHA network is down until a restart. I'm not sure exactly which version introduced it, but I downgraded to 2022.6 and it is reliable again.
@s2d4 @frenck can someone please add the integration:zha tag so the proper developer can look into this?
zha documentation zha source (message by IssueLinks)
Hey there @dmulcahey, @adminiuga, @puddly, mind taking a look at this issue as it has been labeled with an integration (zha) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
I've not been able to reproduce the issue today and have upgraded to 2022.8. Will report back if I see it again.
I was able to reproduce the error yesterday and provided the logs to @puddly. Thanks for the new version, will test as soon as it is released!
I also have the same issue. It keeps restarting my MQTT devices every couple of hours, sometimes after 3 hours, sometimes more:
2022-08-05 18:01:45.499 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-05 18:01:55.974 ERROR (bellows.thread_0) [bellows.uart] Lost serial connection: read failed: [Errno 104] Connection reset by peer
2022-08-05 18:01:55.978 ERROR (MainThread) [bellows.ezsp] NCP entered failed state. Requesting APP controller restart
2022-08-05 18:01:55.982 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(SerialExcepti...eset by peer'))
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
2022-08-05 18:02:04.044 INFO (MainThread) [bellows.zigbee.application] EZSP Radio manufacturer:
2022-08-05 18:02:04.044 INFO (MainThread) [bellows.zigbee.application] EZSP Radio board name:
2022-08-05 18:02:04.045 INFO (MainThread) [bellows.zigbee.application] EmberZNet version: 6.7.8.0 build 373
2022-08-05 18:02:11.774 INFO (MainThread) [zigpy.device] [0x0000] Requesting 'Node Descriptor'
2022-08-05 18:02:12.355 INFO (MainThread) [zigpy.device] [0x0000] Got Node Descriptor: NodeDescriptor(logical_type=<LogicalType.Coordinator: 0>, complex_descriptor_available=0, user_descriptor_available=0, reserved=0, aps_flags=0, frequency_band=<FrequencyBand.Freq2400MHz: 8>, mac_capability_flags=<MACCapabilityFlags.AllocateAddress|RxOnWhenIdle|MainsPowered|FullFunctionDevice|AlternatePanCoordinator: 143>, manufacturer_code=43981, maximum_buffer_size=82, maximum_incoming_transfer_size=128, server_mask=11329, maximum_outgoing_transfer_size=128, descriptor_capability_field=<DescriptorCapability.NONE: 0>, *allocate_address=True, *is_alternate_pan_coordinator=True, *is_coordinator=True, *is_end_device=False, *is_full_function_device=True, *is_mains_powered=True, *is_receiver_on_when_idle=True, *is_router=False, *is_security_capable=False)
2022-08-05 18:02:12.356 INFO (MainThread) [zigpy.device] [0x0000] Already have endpoints: {0: <bellows.zigbee.device.EZSPZDOEndpoint object at 0x9c5580b8>, 1: <EZSPEndpoint id=1 in=[] out=[] status=<Status.NEW: 0>>}
2022-08-05 18:02:12.356 INFO (MainThread) [zigpy.device] [0x0000] Initializing endpoints [<EZSPEndpoint id=1 in=[] out=[] status=<Status.NEW: 0>>]
2022-08-05 18:02:12.356 INFO (MainThread) [zigpy.endpoint] [0x0000:1] Discovering endpoint information
2022-08-05 18:02:12.638 INFO (MainThread) [zigpy.endpoint] [0x0000:1] Discovered endpoint information: SizePrefixedSimpleDescriptor(endpoint=1, profile=260, device_type=1024, device_version=0, input_clusters=[0, 6, 10, 25, 1281], output_clusters=[1, 32, 1280, 1282])
2022-08-05 18:02:12.640 INFO (MainThread) [zigpy.device] [0x0000] Already have model and manufacturer info
2022-08-05 18:02:12.640 INFO (MainThread) [zigpy.device] [0x0000] Discovered basic device information for <EZSPCoordinator model='EZSP' manuf='Silicon Labs' nwk=0x0000 ieee=60:a4:23:ff:fe:08:ec:ed is_initialized=True>
2022-08-05 18:02:39.346 INFO (MainThread) [homeassistant.components.websocket_api.http.connection] [2590052328] Connection closed by client
@PirvuCatalin This issue, which began with the 2022.7.x releases, is about the process crashing and not restarting, leaving the Zigbee network down until the user takes some action such as restarting Home Assistant. In your log it appears to restart correctly.
I am getting off topic, but if you're using a Zigbee coordinator that connects via WiFi, the connection can in some cases be more delicate due to the inherent nature of wireless networks: low signal strength, too many clients on an access point, co-channel interference, etc. In addition, WiFi and Zigbee use overlapping frequencies, and even though Zigbee has much lower power levels, it is possible that in some installations it causes its own interference. The EZSP protocol is rather sensitive to latency/jitter and dropped packets. Like I said, this is getting off topic, but I would start by checking your WiFi access point for the RSSI value and any indication of short connection times or connection drops. Even if you're using an Ethernet controller, the topology of your network could create similar connection drops. I have also seen "NCP entered failed state" messages from USB devices, e.g. on a Raspberry Pi with an inadequate power supply. Follow up on the Home Assistant Discord or Forums for further troubleshooting.
Thanks @Andrew-Joakimsen, I understand what you said above, but that's not the case here: it happened after I upgraded, with no changes to my setup or network. I'll probably downgrade or change my Zigbee coordinator.
Please try out the latest HA Core release, 2022.8.2.
Thanks @puddly. I've just updated to 2022.8.2 without any issues. Will keep you posted about ZHA stability.
Still running smoothly. ZHA disconnected during the night (because my Sonoff ZB Bridge has a scheduled reboot at 05:00AM) but reconnected right after without any problems, so far so good.
Thank you for looking into this @puddly
Unfortunately still not working in 2022.8.3, rolling back to 2022.6.x
I should emphasise that mine is not a WiFi device, and this issue was not opened with regard to known NCP problems with the Sonoff ZBBridge.
Complete zigbee outage after the following logs, which differs in content from the logs received on 2022.7.x but a complete outage at the end nevertheless.
"2022-08-10 17:18:23.380 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:18:38.382 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:18:53.384 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:19:08.386 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:19:23.389 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:19:23.392 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
2022-08-10 17:19:25.038 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:25.041 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:25.062 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:19:35.880 WARNING (MainThread) [bellows.zigbee.application] ControllerApplication reset unsuccessful: TimeoutError()
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 652, in _reset_controller_loop
await self._reset_controller()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 670, in _reset_controller
await self.startup()
File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 66, in startup
await self.connect()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 134, in connect
self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/__init__.py", line 86, in initialize
await ezsp._protocol.initialize(zigpy_config)
File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/protocol.py", line 93, in initialize
_, conf_buffers = await self.getConfigurationValue(
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2022-08-10 17:19:42.530 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:42.531 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:42.532 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:42.578 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:19:42.579 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:19:48.128 WARNING (MainThread) [bellows.zigbee.application] ControllerApplication reset unsuccessful: TimeoutError()
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 652, in _reset_controller_loop
await self._reset_controller()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 670, in _reset_controller
await self.startup()
File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 66, in startup
await self.connect()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 134, in connect
self._ezsp = await bellows.ezsp.EZSP.initialize(self.config)
File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/__init__.py", line 86, in initialize
await ezsp._protocol.initialize(zigpy_config)
File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/protocol.py", line 76, in initialize
await self._cfg(self.types.EzspConfigId[config], value)
File "/usr/local/lib/python3.10/site-packages/bellows/ezsp/protocol.py", line 35, in _cfg
(status,) = await self.setConfigurationValue(config_id, value)
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2022-08-10 17:19:54.783 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:54.784 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:54.785 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:54.786 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:19:54.828 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:19:54.829 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:19:54.839 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:20:55.245 ERROR (MainThread) [homeassistant] Error doing job: Task was destroyed but it is pending!
2022-08-10 17:21:16.371 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:21:31.374 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:21:46.378 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:22:01.381 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:22:16.384 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2022-08-10 17:22:16.387 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
2022-08-10 17:22:18.031 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:22:18.031 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:22:18.032 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:22:18.032 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2022-08-10 17:22:18.076 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:22:18.077 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:22:18.095 WARNING (MainThread) [bellows.ezsp.protocol] Unknown application frame 0x0208 received: b'9067' (b'00800008029067'). This is a bug!
2022-08-10 17:22:43.118 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in startup
await self.start_network()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 210, in start_network
await self.multicast.startup(ezsp_device)
File "/usr/local/lib/python3.10/site-packages/bellows/multicast.py", line 40, in startup
await self._initialize()
File "/usr/local/lib/python3.10/site-packages/bellows/multicast.py", line 29, in _initialize
status, entry = await self._ezsp.getMulticastTableEntry(i)
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError
2022-08-10 17:22:43.124 ERROR (bellows.thread_0) [homeassistant] Error doing job: Exception in callback SerialTransport._call_connection_lost(None)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.10/site-packages/serial_asyncio/__init__.py", line 417, in _call_connection_lost
self._serial.close()
File "/usr/local/lib/python3.10/site-packages/serial/urlhandler/protocol_socket.py", line 104, in close
time.sleep(0.3)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 180, in protected_loop_func
check_loop(func, strict=strict)
File "/usr/src/homeassistant/homeassistant/util/async_.py", line 141, in check_loop
raise RuntimeError(
RuntimeError: Detected blocking call to sleep inside the event loop. Use `await hass.async_add_executor_job()`; This is causing stability issues. Please report issue
2022-08-10 17:22:43.126 WARNING (MainThread) [bellows.zigbee.application] ControllerApplication reset unsuccessful: TimeoutError()
Traceback (most recent call last):
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 456, in wait_for
return fut.result()
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 652, in _reset_controller_loop
await self._reset_controller()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 670, in _reset_controller
await self.startup()
File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 91, in startup
await self.start_network()
File "/usr/local/lib/python3.10/site-packages/bellows/zigbee/application.py", line 210, in start_network
await self.multicast.startup(ezsp_device)
File "/usr/local/lib/python3.10/site-packages/bellows/multicast.py", line 40, in startup
await self._initialize()
File "/usr/local/lib/python3.10/site-packages/bellows/multicast.py", line 29, in _initialize
status, entry = await self._ezsp.getMulticastTableEntry(i)
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 458, in wait_for
raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError"
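For anyone reading the log above, the "Watchdog heartbeat timeout" warnings arrive at a steady cadence before the connection is dropped. A minimal sketch for measuring that cadence from pasted log lines is below. The helper name `timeout_intervals` is made up for illustration, and the sample lines are copied from the log above.

```python
from datetime import datetime

# Sample lines copied from the 2022-08-10 log above.
LOG_LINES = [
    "2022-08-10 17:18:23.380 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:",
    "2022-08-10 17:18:38.382 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:",
    "2022-08-10 17:18:53.384 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:",
    "2022-08-10 17:19:08.386 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:",
    "2022-08-10 17:19:23.389 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:",
    "2022-08-10 17:19:23.392 ERROR (bellows.thread_0) [homeassistant] Error doing job: "
    "Exception in callback SerialTransport._call_connection_lost(None)",
]


def timeout_intervals(lines: list[str]) -> list[float]:
    """Return the seconds between consecutive watchdog timeout warnings."""
    stamps = [
        # The first 23 characters hold the timestamp, e.g. "2022-08-10 17:18:23.380".
        datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        for line in lines
        if "Watchdog heartbeat timeout" in line
    ]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


intervals = timeout_intervals(LOG_LINES)
print([round(seconds) for seconds in intervals])
```

In this excerpt the warnings are roughly 15 seconds apart, and the `_call_connection_lost` error follows immediately after the fifth one.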
Please try adding this to your Home Assistant configuration and restart HA:
zha:
  zigpy_config:
    backup_period: 86400  # remove this after 2022.8.4 is released!
There is a bug with how frequently automatic network backups are performed (new in 2022.8.0), which causes a lot of traffic to your coordinator on startup and then once every 24 minutes, instead of once every 24 hours.
It will not solve the underlying problem with your coordinator "breaking" when there is a sustained burst of data (likely a problem with the ESP8266/ESP32 firmware) but it will at least reduce the frequency of these disconnects until we can figure that other problem out.
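To put the numbers from the explanation above side by side, here is a small arithmetic sketch. It only restates the figures already given (a 86400 second override, 24 hours intended, 24 minutes observed) and does not say anything about what caused the discrepancy.

```python
# Period set by the backup_period override above, in seconds.
backup_period_s = 86400

intended_hours = backup_period_s / 3600      # the override expressed in hours
observed_period_s = 24 * 60                  # 24 minutes, as described above
ratio = backup_period_s / observed_period_s  # how much more often backups ran

print(intended_hours, observed_period_s, ratio)
```

In other words, the coordinator was being asked for a full backup 60 times more often than intended.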
The error persists in 2022.8.4. I also tried using an updated copy of the bellows module and disabled the bellows UART thread (https://github.com/zigpy/bellows/blob/cb11d6cf9175b3a924a0c89ce57c93e4a54dd63c/bellows/uart.py#L357) by changing the True to a False.
Both the blocking sleep call and reconnecting during reconnect failures have been addressed in 2022.8.7. I've been testing with Tube's EFR32 gateway and have only been able to crash it once, with reconnection happening properly afterwards.
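For readers unfamiliar with the "Detected blocking call to sleep inside the event loop" error in the tracebacks above, here is a minimal sketch of the general pattern the message recommends: running a blocking call in an executor. It is not the fix that shipped in 2022.8.7. It uses plain `asyncio` rather than Home Assistant's `hass.async_add_executor_job()`, and `blocking_close` is a hypothetical stand-in for the `close()` call that sleeps for 0.3 seconds.

```python
import asyncio
import time


def blocking_close() -> str:
    """Hypothetical stand-in for a close() that sleeps, as in the traceback."""
    time.sleep(0.3)
    return "closed"


async def close_without_blocking_loop() -> str:
    """Run the blocking close in a worker thread so the event loop stays free."""
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_close)


async def main() -> tuple[str, int]:
    ticks = 0

    async def ticker() -> None:
        # Counts how often the loop gets to run while the close is in progress.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.05)
            ticks += 1

    task = asyncio.create_task(ticker())
    result = await close_without_blocking_loop()
    task.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The ticker keeps advancing while the close runs, which shows that the loop is not stalled. Calling `blocking_close()` directly from a coroutine would freeze every other task for the duration of the sleep.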
I've also added a config option to reduce the number of consecutive watchdog failures needed to handle a disconnect:
zha:
  zigpy_config:
    max_watchdog_failures: 4  # you can reduce this to 1 to immediately trigger a reconnect
Let me know if this release is more stable.
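To illustrate what a "consecutive watchdog failures" threshold like `max_watchdog_failures` means in practice, here is a rough sketch of such a counter. This is an assumption about the general idea, not the bellows implementation. The class `ConsecutiveFailureWatchdog` and the helper `reconnect_points` are hypothetical names, and the default of 4 simply mirrors the value in the config example above.

```python
class ConsecutiveFailureWatchdog:
    """Hypothetical helper: signals a reconnect after N consecutive failures."""

    def __init__(self, max_failures: int = 4) -> None:
        if max_failures < 1:
            raise ValueError("max_failures must be at least 1")
        self.max_failures = max_failures
        self.failures = 0

    def record(self, heartbeat_ok: bool) -> bool:
        """Record one heartbeat result; return True when a reconnect is due."""
        if heartbeat_ok:
            # Any successful heartbeat resets the streak.
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0
            return True
        return False


def reconnect_points(results: list[bool], max_failures: int) -> list[int]:
    """Return the indexes of heartbeats at which a reconnect would be triggered."""
    watchdog = ConsecutiveFailureWatchdog(max_failures)
    return [i for i, ok in enumerate(results) if watchdog.record(ok)]


heartbeats = [True, False, False, True, False, False, False, False]
print(reconnect_points(heartbeats, 4))
print(reconnect_points(heartbeats, 1))
```

With a threshold of 4, the two isolated failures early in the sequence are tolerated and only the final run of four triggers a reconnect. With a threshold of 1, every failed heartbeat does.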
I believe all of the problems related to this issue have been addressed. Can anyone affected by this and running 2022.8.7 (or newer) confirm that stability has improved?
Thanks for all the hard work @puddly!
I cannot confirm this as I've moved over to Z2M in the meanwhile. Hopefully somebody else can confirm the stability.
@puddly Currently I have Home Assistant 2022.9.1 version installed and I have issues with enabling "Zigbee Home Automation". The issue is
2022-09-10 13:18:22.689 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 652, in _skip_bootloader
result = await responses.get()
File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get
await getter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/zigpy/application.py", line 106, in startup
await self.connect()
File "/usr/local/lib/python3.10/site-packages/zigpy_znp/zigbee/application.py", line 111, in connect
await znp.connect()
File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 694, in connect
self.capabilities = (await self._skip_bootloader()).Capabilities
File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 651, in _skip_bootloader
async with async_timeout.timeout(CONNECT_PROBE_TIMEOUT):
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 129, in __aexit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.10/site-packages/async_timeout/__init__.py", line 212, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
2022-09-10 13:18:27.822 WARNING (MainThread) [homeassistant.components.zha.core.gateway] Couldn't start ZNP = Texas Instruments Z-Stack ZNP protocol: CC253x, CC26x2, CC13x2 coordinator (attempt 1 of 3)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/zigpy_znp/api.py", line 652, in _skip_bootloader
result = await responses.get()
File "/usr/local/lib/python3.10/asyncio/queues.py", line 159, in get
await getter
asyncio.exceptions.CancelledError
Is this similar to the previous issue?