HA OS 8.1 breaks ZHA with pizigate

Open gyver opened this issue 2 years ago • 9 comments

Describe the issue you are experiencing

After upgrading HA OS from 7.6 (running Core 2022.6.3) to 8.1, ZHA stopped working and reports:

Error setting up entry pizigate:/dev/serial0 for zha
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError
[...]

Updating to Core 2022.6.4 didn't solve the problem. As device names can change with kernel updates, I tried replacing /dev/serial0 with /dev/serial1 (the only other available serial device) and their original names (ttyAMA0 and ttyS0) in config/.storage/core.config_entries with no change in behavior.

Surprisingly downgrading to 7.6 with (ha os update --version 7.6) and even to core 2022.6.3 didn't fix the problem.

Here is the full log for the ZHA loading failure :

Logger: homeassistant.config_entries
Source: components/zha/core/gateway.py:173
First occurred: 2:27:37 AM (1 occurrences)
Last logged: 2:27:37 AM

Error setting up entry pizigate:/dev/ttyS0 for zha
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/zigpy_zigate/api.py", line 133, in command
    return await asyncio.wait_for(
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 492, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 339, in async_setup
    result = await component.async_setup_entry(hass, self)
  File "/usr/src/homeassistant/homeassistant/components/zha/__init__.py", line 102, in async_setup_entry
    await zha_gateway.async_initialize()
  File "/usr/src/homeassistant/homeassistant/components/zha/core/gateway.py", line 173, in async_initialize
    self.application_controller = await app_controller_cls.new(
  File "/usr/local/lib/python3.9/site-packages/zigpy/application.py", line 69, in new
    await app.startup(auto_form)
  File "/usr/local/lib/python3.9/site-packages/zigpy_zigate/zigbee/application.py", line 39, in startup
    await self._api.set_raw_mode()
  File "/usr/local/lib/python3.9/site-packages/zigpy_zigate/api.py", line 171, in set_raw_mode
    await self.command(0x0002, data)
  File "/usr/local/lib/python3.9/site-packages/zigpy_zigate/api.py", line 142, in command
    raise NoResponseError
zigpy_zigate.api.NoResponseError
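
A note on reading the chained traceback above: its three sections look like a single timeout seen at three levels. Here is a minimal sketch of that pattern, assuming the driver waits for the radio's reply with asyncio.wait_for and turns the timeout into its own error. The NoResponseError class and the command() helper below are hypothetical stand-ins written for illustration, not the actual zigpy_zigate code.

```python
import asyncio


class NoResponseError(Exception):
    """Hypothetical stand-in for zigpy_zigate.api.NoResponseError."""


async def command(response: asyncio.Future, timeout: float) -> bytes:
    """Wait for a reply from the radio and give up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(response, timeout=timeout)
    except asyncio.TimeoutError:
        # Raising inside the except block is what yields the
        # "During handling of the above exception..." section.
        raise NoResponseError


async def main() -> str:
    loop = asyncio.get_running_loop()
    silent_radio = loop.create_future()  # never resolved: the radio stays silent
    try:
        await command(silent_radio, timeout=0.1)
    except NoResponseError as err:
        # The timeout is kept as the implicit context of the new error.
        return type(err.__context__).__name__
    return "got a response"


result = asyncio.run(main())
print(result)
```

Read this way, the log says the serial port could be opened but nothing answered on it within the timeout. It does not say why the radio stayed silent.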

There are errors later that might be related :

Logger: zigpy_zigate.common
Source: /usr/local/lib/python3.9/site-packages/zigpy_zigate/common.py:71
First occurred: 2:27:34 AM (1 occurrences)
Last logged: 2:27:34 AM

No module named 'RPi'

Logger: zigpy_zigate.common
Source: /usr/local/lib/python3.9/site-packages/zigpy_zigate/common.py:70
First occurred: 2:27:34 AM (1 occurrences)
Last logged: 2:27:34 AM

Unable to set PiZiGate GPIO, please check configuration

What operating system image do you use?

rpi4-64 (Raspberry Pi 4/400 64-bit OS)

What version of Home Assistant Operating System is installed?

8.1

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

Not sure of the exact steps to reproduce; this is a nearly 2-year-old install that has been kept updated until now.

  1. Use a pizigate with ZHA (you need to add enable_uart=1 to /boot/config.txt; after that it is pretty much plug and play)
  2. Upgrade from 7.6 to 8.1
  3. Lose control of your Zigbee network

Anything in the Supervisor logs that might be useful for us?

Nothing much in supervisor logs. The only warning is most probably unrelated :

22-06-10 03:11:21 WARNING (MainThread) [supervisor.addons.options] Option 'require_ssl' does not exist in the schema for Node-RED (a0d7b954_nodered)

Anything in the Host logs that might be useful for us?

Nothing seems out of the ordinary in the boot log.

System Health information

System Health

version: core-2022.6.3
installation_type: Home Assistant OS
dev: false
hassio: true
docker: true
user: root
virtualenv: false
python_version: 3.9.12
os_name: Linux
os_version: 5.10.103-v8
arch: aarch64
timezone: Europe/Paris

Home Assistant Community Store

GitHub API: ok
GitHub Content: ok
GitHub Web: ok
GitHub API Calls Remaining: 4948
Installed Version: 1.24.5
Stage: running
Available Repositories: 1044
Downloaded Repositories: 7

Home Assistant Cloud

logged_in: false
can_reach_cert_server: ok
can_reach_cloud_auth: ok
can_reach_cloud: ok

Home Assistant Supervisor

host_os: Home Assistant OS 7.6
update_channel: stable
supervisor_version: supervisor-2022.05.3
agent_version: 1.2.1
docker_version: 20.10.9
disk_total: 234.1 GB
disk_used: 14.8 GB
healthy: true
supported: true
board: rpi4-64
supervisor_api: ok
version_api: ok
installed_addons: SSH & Web Terminal (10.1.1), File editor (5.3.3), InfluxDB (4.4.1), Grafana (7.5.2), WireGuard (0.6.0), Node-RED (11.1.1), Check Home Assistant configuration (3.10.2)

Dashboards

dashboards: 2
resources: 2
views: 11
mode: storage

Recorder

oldest_recorder_run: June 8, 2022, 1:11 AM
current_recorder_run: June 10, 2022, 3:11 AM
estimated_db_size: 927.34 MiB
database_engine: sqlite
database_version: 3.34.1

Spotify

api_endpoint_reachable: ok

Additional information

No response

gyver avatar Jun 10 '22 01:06 gyver

Surprisingly downgrading to 7.6 with (ha os update --version 7.6) and even to core 2022.6.3 didn't fix the problem.

That is weird. There must be something else in play here (maybe another addon?). Did you downgrade after editing the serial port?

agners avatar Jun 10 '22 18:06 agners

Yes this is definitely weird and I suspect user error but I'm at a loss about how I could break it and how to fix it. Maybe something else (a previous unrelated upgrade) changed the environment and the change was only activated when the upgrade restarted core.

I started by trying all ports on 8.1. I may not have used core 2022.6.4 for all these tries, beginning my tries with 2022.6.3. I think I tried most combinations though and I'm positive I tried all serial devices (there are only 2, 4 if you count the aliases). Then I reverted to 7.6 and then reverted to the original serial0.

Fortunately I don't have all my eggs in the same basket and most of my Zigbee devices are on another network, managed by Zigbee2Mqtt (running on another Raspberry Pi). I have 2 brand new USB Zigbee coordinators from other manufacturers to test so I plan to migrate the currently unmanaged devices to another new network managed by another Zigbee2Mqtt instance. If there's no obvious way to salvage the current ZHA instance I'll probably reinstantiate ZHA from scratch in an up to date HomeAssistant OS to at least check if this solves my problem.

As I'm not a Docker expert and couldn't find documentation on how HAOS manages its Docker containers, and in particular how ZHA is integrated and is supposed to gain access to the /dev/serial0 device, I'm a bit in the dark. According to the output of docker container inspect homeassistant, the "homeassistant" container has a bind mount for /dev marked read-only. I'm not sure whether that is enough (or even whether ZHA runs inside this container and not another). If I have more time I'll try to dive into the internals.

gyver avatar Jun 10 '22 19:06 gyver

We use a bind mount of /dev from the host so we don't have to manage udev rules to add new devices etc. The read-only mount seems not to affect device writes so far. HA Core (and with that ZHA) has full access to all the devices anyways, so there should really not be a problem.
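
For anyone who wants to verify the device nodes from inside the container, here is a rough sketch using only the Python standard library. The describe_device() helper is a hypothetical name made up for this example, and the candidate paths are simply the ones mentioned earlier in this thread. It only reports what the filesystem says about each node; it does not talk to the radio.

```python
import os
import stat


def describe_device(path: str) -> dict:
    """Report whether `path` looks like a usable serial device node."""
    info = {"path": path, "exists": os.path.exists(path)}
    if not info["exists"]:
        return info
    # /dev/serial0 and /dev/serial1 are aliases, so show the real target too.
    info["resolved"] = os.path.realpath(path)
    info["is_char_device"] = stat.S_ISCHR(os.stat(path).st_mode)
    info["readable"] = os.access(path, os.R_OK)
    info["writable"] = os.access(path, os.W_OK)
    return info


# Paths taken from this thread; adjust for your own setup.
for candidate in ("/dev/serial0", "/dev/serial1", "/dev/ttyAMA0", "/dev/ttyS0"):
    print(describe_device(candidate))
```

A node that exists, is a character device and is readable and writable only rules out a permissions problem. A silent radio would still time out exactly as in the logs above.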

If you have OS level SSH access, I'd suggest to disable ZHA in Core temporarily, and docker exec into the homeassistant container, and try bellows (a zigpy debug utility for the bellows driver) directly:

bellows --device /dev/serial0 --baudrate 115200 info

agners avatar Jun 10 '22 22:06 agners

Done, here is the result :

# docker exec homeassistant bellows --device /dev/serial0 --baudrate 115200 info
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/bellows", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/bellows/cli/util.py", line 39, in inner
    loop.run_until_complete(f(*args, **kwargs))
  File "/usr/local/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.9/site-packages/bellows/cli/ncp.py", line 69, in info
    s = await util.setup(ctx.obj["device"], ctx.obj["baudrate"])
  File "/usr/local/lib/python3.9/site-packages/bellows/cli/util.py", line 118, in setup
    await s.reset()
  File "/usr/local/lib/python3.9/site-packages/bellows/ezsp/__init__.py", line 98, in reset
    await self._gw.reset()
  File "/usr/local/lib/python3.9/site-packages/bellows/uart.py", line 223, in reset
    return await asyncio.wait_for(self._reset_future, timeout=RESET_TIMEOUT)
  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 492, in wait_for
    raise exceptions.TimeoutError() from exc
asyncio.exceptions.TimeoutError

The command seems to block there indefinitely instead of returning but can be interrupted.

gyver avatar Jun 10 '22 22:06 gyver

@gyver I’m not a ZHA specialist, so I can't help you with ZHA specifics. But I normally look up my devices in HA under Settings -> System -> Hardware, then open the “All hardware” option.

erkr avatar Jun 14 '22 17:06 erkr

Note : removing ZHA and reinstalling it doesn't work either (using latest OS and Core). I get the same errors that came separately from the timeout. To be precise :

No module named 'RPi' and Unable to set PiZiGate GPIO

This seems like missing Python module(s) ?
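
A quick way to confirm this from a Python prompt inside the homeassistant container, as a minimal sketch: module_available() is a hypothetical helper written for this example, and it relies only on importlib from the standard library.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when the parent package is missing.
        return False


for name in ("RPi", "RPi.GPIO"):
    print(name, "available" if module_available(name) else "missing")
```

If both lines print "missing", the "No module named 'RPi'" message is literally true for that Python environment, independent of the serial port configuration.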

gyver avatar Jun 25 '22 11:06 gyver

I have the same issue (missing RPi.GPIO module) with a fresh install on an rpi 3b and a pizigate. I tried upgrading to the beta version, but that did not solve the issue. I also tried to pip install RPi.GPIO in the homeassistant container; it fails to compile because gcc is not a recognized command. I then tried to pip install on another rpi running raspbian and scp the modules to the homeassistant container, and got various errors (and it would not survive any upgrade anyway).

It seems zigpy-zigate needs that module to handle the pizigate, but I didn't find any version-specific requirement. Old versions of RPi.GPIO are not compatible with python3.10 (the version included in the beta), so I could not go too far in my tests. I will now downgrade and try again.

molusk avatar Jun 30 '22 12:06 molusk

I downgraded Core to 2022.5.4 (I mistyped 2022.5.5, so I am now upgrading to that one...). The RPi.GPIO module is there and I could configure ZHA with my pizigate. RPi.GPIO was removed from Home Assistant Core in version 2022.6.0.

molusk avatar Jun 30 '22 14:06 molusk

Maybe another solution with HACS : https://github.com/thecode/ha-rpi_gpio It should install mandatory rpi.gpio module 😉

molusk avatar Jun 30 '22 23:06 molusk

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 29 '22 00:09 github-actions[bot]

Hi all, I have the same problem. I am using a Conbee II stick and running HA OS as a VM on Unraid. ZHA stopped working after an update (I don't remember exactly which).

Here is some evidence of the problem:

(Screenshot: hardware recognized) The hardware is listed in the HA hardware section.

(Screenshot: Conf - 1) ... and listed when trying to reconfigure ZHA (Screenshot: Conf - 2), but I still get the unable to connect error.

Here is the log:

2023-06-17 12:00:41.075 DEBUG (MainThread) [zigpy_deconz.uart] Connecting to /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2225577-if00
2023-06-17 12:00:41.076 DEBUG (MainThread) [zigpy_deconz.uart] Connection made
2023-06-17 12:00:41.076 DEBUG (MainThread) [zigpy_deconz.uart] Connected to /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2225577-if00
2023-06-17 12:00:41.076 DEBUG (MainThread) [zigpy_deconz.api] Command Command.read_parameter (1, <NetworkParameter.protocol_version: 34>, b'')
2023-06-17 12:00:41.077 DEBUG (MainThread) [zigpy_deconz.uart] Send: 0x0a02000800010022
2023-06-17 12:00:42.889 WARNING (MainThread) [zigpy_deconz.api] No response to 'Command.read_parameter' command with seq id '0x02'
2023-06-17 12:00:42.890 ERROR (MainThread) [zigpy.application] Couldn't start application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 490, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/zigpy/application.py", line 193, in startup
    await self.connect()
  File "/usr/local/lib/python3.11/site-packages/zigpy_deconz/zigbee/application.py", line 81, in connect
    self.version = await api.version()
                   ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy_deconz/api.py", line 457, in version
    (self._proto_ver,) = await self.read_parameter(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy_deconz/api.py", line 422, in read_parameter
    r = await self._command(Command.read_parameter, 1 + len(data), param, data)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/zigpy_deconz/api.py", line 306, in _command
    return await asyncio.wait_for(fut, timeout=COMMAND_TIMEOUT)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 492, in wait_for
    raise exceptions.TimeoutError() from exc
TimeoutError
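
To make the "Send:" line in the log easier to read, here is a small sketch that splits that frame into fields. The header layout used below (command id, sequence number, status byte, frame length, payload length, all little-endian) is my reading of the deCONZ serial protocol and should be treated as an assumption; the variable names are made up for this example.

```python
import struct

# Frame copied from the "Send:" line in the log above.
frame = bytes.fromhex("0a02000800010022")

# Assumed header layout: three single bytes followed by two
# little-endian 16-bit lengths, then the payload.
command_id, seq, status, frame_len, payload_len = struct.unpack_from("<BBBHH", frame)
parameter_id = frame[7]

print(f"command id     : 0x{command_id:02x}")
print(f"sequence number: 0x{seq:02x}")
print(f"status         : {status}")
print(f"frame length   : {frame_len} (actual bytes: {len(frame)})")
print(f"payload length : {payload_len}")
print(f"parameter id   : {parameter_id}")
```

Under that assumption the decoded sequence number (0x02) and parameter id (34, protocol_version) match what the log reports, so the request itself looks well-formed and the stick simply never answers it.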

Some additional info:

  • I can connect to the stick using the Deconz add-on, which I then disabled to avoid the known conflicts with ZHA
  • I tried to connect using the /dev/ttyACM0 path as well as the by-id path.
  • I tried to downgrade the OS but the issue persists

I have been unable to use any Zigbee device for a while now... and I am really struggling to find a solution.

Has anyone with the same problem managed to fix it? Does anyone have an idea of what to do?

Thanks for the support!

JMoro-GH avatar Jun 17 '23 10:06 JMoro-GH