Broadcasts failing on ember after migration

Open julien-billaud opened this issue 10 months ago • 139 comments

What happened?

I had no issues for more than a year with the Sonoff Dongle-E and the ezsp driver. I tried switching the driver to ember (multiple times), but nothing works: sometimes all the devices are lost, sometimes they are still there but impossible to interact with, and pairing never works. For now I have gone back to the ezsp driver. I don't see many errors in the log, only the broadcast error reported here: https://github.com/Koenkk/zigbee2mqtt/issues/22445

I tried the exact same configuration on a regular x86 computer running Debian (using the same Zigbee dongle) and didn't face any issue, so the problem seems to be linked to the Raspberry Pi 4.

What did you expect to happen?

No response

How to reproduce it (minimal and precise)

Switch from the ezsp to the ember driver

Zigbee2MQTT version

1.37.0

Adapter firmware version

7.4.2.0 build 0

Adapter

Sonoff dongle-e

Setup

Raspberry pi 4 using docker image

Debug log

No response

julien-billaud avatar May 04 '24 20:05 julien-billaud

Any chance you can downgrade to 7.4.1 and see if you still have those problems on the pi?

Nerivec avatar May 04 '24 21:05 Nerivec

Same problem with SLZB-06M

But I don't have a raspberry pi 4, host is a x86 machine, running unraid and zigbee2mqtt in docker.

fir3drag0n avatar May 04 '24 21:05 fir3drag0n

Grouping the mentioned broadcasting issue here guys (https://github.com/Koenkk/zigbee2mqtt/issues/22445, https://github.com/Koenkk/zigbee2mqtt/issues/22398) @supaeasy @alainsch @Ricc68 @VladislavVesely @luqsq

I cannot reproduce this with my Dongle-E. I've tried various firmware, various ways to migrate from ezsp to ember (even bad ones 😅). Can you guys think of something that may be different in your setup from a "regular setup"?

Nerivec avatar May 04 '24 22:05 Nerivec

Same problem with SLZB-06M

But I don't have a raspberry pi 4, host is a x86 machine, running unraid and zigbee2mqtt in docker.

adapter: ember
rtscts: false

May need to add 'rtscts' below adapter setting.

raphael1688 avatar May 05 '24 04:05 raphael1688

Can you guys think of something that may be different in your setup from a "regular setup"?

Two things: I recently installed https://www.zigbee2mqtt.io/devices/ZFP-1A-CH.html#siglis-zfp-1a-ch

Which I think is not a very common router. Swiss market only and most likely not very popular. Initially I had problems with it. Also, shortly after I installed it, my second Dongle-E, which I use as a router, had to be re-paired, and it was one of the first devices in my 2-year-old network and one I never had any problems with.

Second: Shortly before my Router Dongle failed I set reporting interval of every lamp to 1-3 seconds because I didn't see lamps status change quickly enough (or at all) when pressing a HW button like the switches mentioned above. After the Dongle failed I reverted this to 1-30 s and had no problems since. But I did the reverting before I saw the error in logs.

Also I have to say: I haven't noticed any bigger problems or misbehavior. I just saw the error in the logs. The only real problem I have is that sometimes (not reproducibly) some IKEA bulbs start in maximum dimmed mode even though at least one of them is never dimmed manually.

supaeasy avatar May 05 '24 06:05 supaeasy

Grouping the mentioned broadcasting issue here guys (#22445, #22398) @supaeasy @alainsch @Ricc68 @VladislavVesely @luqsq

I cannot reproduce this with my Dongle-E. I've tried various firmware, various ways to migrate from ezsp to ember (even bad ones 😅). Can you guys think of something that may be different in your setup from a "regular setup"?

Since the Dongle-E works with a Docker image in an x86 environment, I'm guessing there is no issue with the Zigbee dongle itself. Focusing on the specifics of my configuration, here is what comes to mind that might differ from a regular installation:

  • RPI4 using an Argon One case
  • RPI4 is booting from an SSD which is plugged into the USB3 port just below the Zigbee dongle
  • 64 bit is enabled for that OS
  • The persistent data of the container is stored on an encrypted (LUKS) volume which is mounted on boot

Everything else is quite standard in my opinion.

julien-billaud avatar May 05 '24 06:05 julien-billaud

Nothing special over here. Had 1.36 running with SLZB-06M running on zigbee FW 20231030. Everything was running OK with adapter: ezsp

Did the following steps:

  • upgraded addon to 1.37

  • received the "zh:ezsp: Deprecated driver 'ezsp' currently in use, 'ember' will become the..." messages

  • changed adapter: ezsp to adapter: ember and restarted

  • got an error that my coordinator was not on EZSP13

  • upgraded my coordinator firmware to FW 20240408

  • as advised by SMLight, changed config in zigbee2mqtt to "adapter: ember" + "rtscts: false"

  • restarted zigbee2mqtt and zigbee network is working

  • now at startup I get the message "zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":0,"clusterId":19,"sourceEndpoint":0,"destinationEndpoint":0,"options":0,"groupId":0,"sequence":212} messageTag=255]"

  • pairing new entities does not work due to the same error

  • switching back to "adapter: ezsp" doesn't work either as I then get the error "zh:controller:greenpower: Received undefined command from '0'". Another user already created a ticket for this.

So currently I'm in a state that my network is running, but I can't add any new devices.

Is there any more info we can provide?

alainsch avatar May 05 '24 07:05 alainsch

Oh I should have mentioned that I am running HAOS in a VM on Synology DSM 7.2.

Interference should not be a problem as my dongle is in a USB 2 port with a 2 m extension cable.

supaeasy avatar May 05 '24 07:05 supaeasy

My setup is HAOS running on a ODROID M1 with 8GB RAM and 512 GB SSD.

alainsch avatar May 05 '24 07:05 alainsch

Nothing special over here. Had 1.36 running with SLZB-06M running on zigbee FW 20231030. Everything was running OK with adapter: ezsp

Did the following steps:

  • upgraded addon to 1.37
  • received the "zh:ezsp: Deprecated driver 'ezsp' currently in use, 'ember' will become the..." messages
  • changed adapter: ezsp to adapter: ember and restarted
  • got an error that my coordinator was not on EZSP13
  • upgraded my coordinator firmware to FW 20240408
  • as advised by SMLight, changed config in zigbee2mqtt to "adapter: ember" + "rtscts: false"
  • restarted zigbee2mqtt and zigbee network is working
  • now at startup I get the message "zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":0,"clusterId":19,"sourceEndpoint":0,"destinationEndpoint":0,"options":0,"groupId":0,"sequence":212} messageTag=255]"
  • pairing new entities does not work due to the same error
  • switching back to "adapter: ezsp" doesn't work either as I then get the error "zh:controller:greenpower: Received undefined command from '0'". Another user already created a ticket for this.

So currently I'm in a state that my network is running, but I can't add any new devices.

Is there any more info we can provide?

Exactly the same behavior. Plus the problem that no new devices can be paired with ember, while with ezsp I can add devices. In my case all my routers in particular get disconnected.

fir3drag0n avatar May 05 '24 07:05 fir3drag0n

I do have 4 mmwave presence sensors. Maybe these devices have an influence.

fir3drag0n avatar May 05 '24 07:05 fir3drag0n

Sorry, posted my follow-up on the wrong ticket...

These are the messages I see when I startup Zigbee2MQTT. Maybe they are related.

[2024-05-05 11:00:43] info: z2m: Logging to console, file (filename: log.log)
[2024-05-05 11:00:49] info: z2m: Starting Zigbee2MQTT version 1.37.0 (commit #unknown)
[2024-05-05 11:00:49] info: z2m: Starting zigbee-herdsman (0.45.0)
[2024-05-05 11:00:49] info: zh:ember: ======== Ember Adapter Starting ========
[2024-05-05 11:00:49] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-05-05 11:00:49] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-05-05 11:00:49] info: zh:ember:uart:ash: Socket ready
[2024-05-05 11:00:49] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-05-05 11:00:51] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-05-05 11:00:51] info: zh:ember:uart:ash: ======== ASH started ========
[2024-05-05 11:00:51] info: zh:ember:ezsp: ======== EZSP started ========
[2024-05-05 11:00:51] warning: zh:ember: [EzspConfigId] Failed to SET "ADDRESS_TABLE_SIZE" TO "16" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 11:00:51] warning: zh:ember: [EzspConfigId] Failed to SET "APS_UNICAST_MESSAGE_COUNT" TO "32" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 11:00:51] warning: zh:ember: [EzspConfigId] Failed to SET "NEIGHBOR_TABLE_SIZE" TO "26" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 11:00:51] warning: zh:ember: [EzspConfigId] Failed to SET "SOURCE_ROUTE_TABLE_SIZE" TO "200" with status=ERROR_INVALID_VALUE. Firmware value will be used instead.
[2024-05-05 11:00:51] warning: zh:ember: [EzspConfigId] Failed to SET "MULTICAST_TABLE_SIZE" TO "16" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 11:00:51] info: zh:ember: [STACK STATUS] Network up.
[2024-05-05 11:00:51] info: zh:ember: [INIT TC] NCP network matches config.
[2024-05-05 11:00:51] info: zh:ember: [CONCENTRATOR] Started source route discovery. 1247ms until next broadcast.
[2024-05-05 11:00:51] info: z2m: zigbee-herdsman started (resumed)
[2024-05-05 11:00:51] info: z2m: Coordinator firmware version: '{"meta":{"build":0,"ezsp":13,"major":7,"minor":4,"patch":1,"revision":"7.4.1 [GA]","special":0,"type":170},"type":"EmberZNet"}'
[2024-05-05 11:00:51] info: z2m: Currently 12 devices are joined: ...
[2024-05-05 11:00:51] info: z2m: Zigbee: disabling joining new devices.
[2024-05-05 11:00:51] info: z2m: Connecting to MQTT server at mqtt://core-mosquitto:1883
[2024-05-05 11:00:52] info: z2m: Connected to MQTT server
[2024-05-05 11:00:52] info: z2m: Started frontend on port 8099
[2024-05-05 11:00:53] info: z2m: Zigbee2MQTT started!
[2024-05-05 11:01:11] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":31,"sourceEndpoint":0,"destinationEndpoint":0,"options":0,"groupId":0,"sequence":0} messageTag=255]
[2024-05-05 11:01:23] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":31,"sourceEndpoint":0,"destinationEndpoint":0,"options":0,"groupId":0,"sequence":0} messageTag=255]
[2024-05-05 11:01:33] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":31,"sourceEndpoint":0,"destinationEndpoint":0,"options":0,"groupId":0,"sequence":0} messageTag=255]

Whenever I try to start the pairing process, I see these messages:

[2024-05-05 11:03:28] info: z2m: Zigbee: allowing new devices to join.
[2024-05-05 11:03:28] info: zh:ember: [STACK STATUS] Network opened.
[2024-05-05 11:03:29] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":54,"sourceEndpoint":0,"destinationEndpoint":0,"options":256,"groupId":0,"sequence":240} messageTag=2]
[2024-05-05 11:03:29] error: zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":41440,"clusterId":33,"sourceEndpoint":242,"destinationEndpoint":242,"options":256,"groupId":0,"sequence":241} messageTag=3]
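For anyone trying to read these errors, here is a minimal Python sketch that turns one "Delivery of BROADCAST failed" line into something more readable. The lookup tables reflect my understanding of the Zigbee specification (broadcast addresses 0xFFFC/0xFFFD/0xFFFF, a few ZDO cluster IDs, and the Green Power profile 0xA1E0) and only cover values that appear in this thread. The function name `decode_failed_broadcast` is hypothetical and is not part of zigbee2mqtt or zigbee-herdsman.

```python
import json
import re

# Well-known Zigbee broadcast destinations (assumption: taken from the Zigbee spec).
BROADCAST_ADDRESSES = {
    0xFFFF: "all devices",
    0xFFFD: "all devices with receiver on when idle",
    0xFFFC: "all routers and the coordinator",
}

# ZDO (profile 0) cluster IDs seen in the logs of this thread.
ZDO_CLUSTERS = {
    0x0013: "Device_annce",
    0x001F: "Parent_annce",
    0x0036: "Mgmt_Permit_Joining_req",
}

GREEN_POWER_PROFILE = 0xA1E0  # 41440 in decimal, as printed in the logs

LINE_RE = re.compile(
    r'Delivery of BROADCAST failed for "(?P<dest>\d+)" '
    r'\[apsFrame=(?P<frame>\{.*?\}) messageTag=(?P<tag>\d+)\]'
)


def decode_failed_broadcast(line):
    """Return a readable summary of one failed-broadcast log line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    dest = int(match.group("dest"))
    frame = json.loads(match.group("frame"))  # the apsFrame is plain JSON
    profile = frame["profileId"]
    cluster = frame["clusterId"]
    if profile == 0:
        what = ZDO_CLUSTERS.get(cluster, f"ZDO cluster 0x{cluster:04X}")
    elif profile == GREEN_POWER_PROFILE:
        what = f"Green Power cluster 0x{cluster:04X}"
    else:
        what = f"profile 0x{profile:04X} cluster 0x{cluster:04X}"
    return {
        "destination": f"0x{dest:04X} ({BROADCAST_ADDRESSES.get(dest, 'unknown')})",
        "message": what,
    }
```

Read this way, the two errors logged when opening the network would be the permit-join request sent to 0xFFFC and a Green Power frame sent to 0xFFFD, and the startup errors with clusterId 31 would be a Parent_annce. This is only an interpretation of the log fields, not a diagnosis.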

alainsch avatar May 05 '24 09:05 alainsch

@alainsch I also had a discussion with @Nerivec on Discord, because I also still get the same error message.

fir3drag0n avatar May 05 '24 09:05 fir3drag0n

Exactly the same behavior. Plus the problem that no new devices can be paired with ember, while with ezsp I can add devices. In my case all my routers in particular get disconnected.

Ah yes, and I wasn't aware it is related...

I have a SLZB-06M as coordinator (groundfloor) and a Sonoff Dongle-E flashed as router (first floor). Yesterday evening my Sonoff router got disconnected. It is while trying to pair it again that I found out I couldn't pair any devices.

I have a very small zigbee network (more a test setup here), so I have no other routers, only end devices.

alainsch avatar May 05 '24 09:05 alainsch

@alainsch I also had a discussion with @Nerivec on Discord, because I also still get the same error message.

I'm pretty new to discord, I'll try to find the channel (?) so I can follow the discussion.

alainsch avatar May 05 '24 09:05 alainsch

Exactly the same behavior. Plus the problem that no new devices can be paired with ember, while with ezsp I can add devices. In my case all my routers in particular get disconnected.

Ah yes, and I wasn't aware it is related...

I have a SLZB-06M as coordinator (groundfloor) and a Sonoff Dongle-E flashed as router (first floor). Yesterday evening my Sonoff router got disconnected. It is while trying to pair it again that I found out I couldn't pair any devices.

I have a very small zigbee network (more a test setup here), so I have no other routers, only end devices.

I already have nearly 70 devices...

fir3drag0n avatar May 05 '24 09:05 fir3drag0n

I already have nearly 70 devices...

Here at home, HA is a small setup (12 devices) I use mainly for testing. But in our vacation home, everything is controlled by HA and we have 51 zigbee and 33 ESPHome devices.

In this second setup, I also have the same SLZB-06M coordinator, but still on the older 20231030 firmware, where the adapter is still defined as 'adapter: ezsp'.

Since I upgraded to 1.37, I couldn't pair any new devices there either, due to another error: "zh:controller:greenpower: Received undefined command from '0'"

And that setup is not a test setup :-(

alainsch avatar May 05 '24 09:05 alainsch

@alainsch I also had a discussion with @Nerivec on Discord, because I also still get the same error message.

I'm pretty new to discord, I'll try to find the channel (?) so I can follow the discussion.

In the development-branch channel. What we both have in common is the same coordinator (I am on the dev firmware right now). But with only 12 devices in your setup, network size can probably be ruled out as the cause.

fir3drag0n avatar May 05 '24 09:05 fir3drag0n

Very very simple configuration here.

HAOS on a QEMU VM on a low-end x86-64 QNAP NAS, with 2 CPUs and 2 GB RAM as suggested by the HAOS setup guide. I have seen a lot of people using VMs or ARM devices: one common point may be low resources in terms of CPU power and/or RAM.

I can report on two setups:

  1. ZBDongle-E with fw 7.4.2, Z2M 1.37.0, ember driver. Only the ZBDongle-E is in the ZigBee network, so the coordinator is the only node. The broadcast errors happen. This may rule out the devices and put the spotlight on the coordinator.
  2. Same ZBDongle-E setup as above, but with 2 Sonoff TRVZB valves added to the ZigBee network: the same error continues to happen. Since it already happened with the coordinator alone in setup 1, I would rule out the 2 added devices as the cause.

Anyway, I see from other posts that the error happens with a variety of devices. Looking for another common factor, every network showing the error has a coordinator, which again puts the spotlight on the coordinator.

I see that @Nerivec is not able to reproduce the issue. Needless to say, Nerivec is also working with a coordinator, which should rule out the coordinator itself (unless there is some elusive hardware factor common to the affected coordinators). Maybe a good starting point would be to run the system on a low-resource/slow host, or a VM with limited resources, to see how Z2M handles the coordinator there.

Another hint may be found in the first post from @julien-billaud: "I've tried the exact same configuration on a regular x86 computer running debian (using the same zigbee dongle) and didn't face any issue which seems to be a linked with the Raspberry pi 4".

Ricc68 avatar May 05 '24 10:05 Ricc68

OK, because my setup is a small setup mainly for test, I did the following steps:

  • removed zigbee2mqtt addon
  • removed the zigbee2mqtt folder from my config
  • re-installed zigbee2mqtt with the SLZB-06M and "adapter:ezsp"
  • startup and got the following messages

[12:01:03] INFO: Preparing to start...
[12:01:04] INFO: Socat not enabled
[12:01:10] INFO: Starting Zigbee2MQTT...
[2024-05-05 12:01:14] info: z2m: Logging to console, file (filename: log.log)
[2024-05-05 12:01:20] info: z2m: Starting Zigbee2MQTT version 1.37.0 (commit #unknown)
[2024-05-05 12:01:20] info: z2m: Starting zigbee-herdsman (0.45.0)
[2024-05-05 12:01:20] warning: zh:ezsp: Deprecated driver 'ezsp' currently in use, 'ember' will become the officially supported EmberZNet driver in next release. If using Zigbee2MQTT see https://github.com/Koenkk/zigbee2mqtt/discussions/21462
[2024-05-05 12:01:24] info: zh:ezsp:driv: Leaving current network and forming new network
[2024-05-05 12:01:25] info: zh:ezsp:driv: Form network
[2024-05-05 12:01:26] info: zh:controller: Wrote coordinator backup to '/config/zigbee2mqtt/level_0/coordinator_backup.json'
[2024-05-05 12:01:26] info: z2m: zigbee-herdsman started (reset)
[2024-05-05 12:01:26] info: z2m: Coordinator firmware version: '{"meta":{"maintrel":"1 ","majorrel":"7","minorrel":"4","product":13,"revision":"7.4.1.0 build 0"},"type":"EZSP v13"}'
[2024-05-05 12:01:26] info: z2m: Currently 0 devices are joined:
[2024-05-05 12:01:26] info: z2m: Zigbee: disabling joining new devices.
[2024-05-05 12:01:27] info: z2m: Connecting to MQTT server at mqtt://core-mosquitto:1883
[2024-05-05 12:01:27] info: z2m: Connected to MQTT server
[2024-05-05 12:01:28] info: z2m: Started frontend on port 8099
[2024-05-05 12:01:28] info: z2m: Zigbee2MQTT started!

  • when I try to start pairing a device, I see

[2024-05-05 12:01:40] info: z2m: Zigbee: allowing new devices to join.
[2024-05-05 12:01:41] error: zh:controller:greenpower: Received undefined command from '0'
[2024-05-05 12:02:00] info: zh:controller: Interview for '0x00158d0008083d2a' started
[2024-05-05 12:02:00] info: z2m: Device '0x00158d0008083d2a' joined
[2024-05-05 12:02:00] info: z2m: Starting interview of '0x00158d0008083d2a'
[2024-05-05 12:02:11] info: zh:controller: Succesfully interviewed '0x00158d0008083d2a'
[2024-05-05 12:02:11] info: z2m: Successfully interviewed '0x00158d0008083d2a', device has successfully been paired
[2024-05-05 12:02:11] info: z2m: Device '0x00158d0008083d2a' is supported, identified as: Aqara Motion sensor (RTCGQ11LM)
[2024-05-05 12:02:11] info: z2m: Configuring '0x00158d0008083d2a'
[2024-05-05 12:02:11] info: z2m: Successfully configured '0x00158d0008083d2a'

  • so pairing is possible in "adapter:ezsp" mode. Removed the device...

[2024-05-05 12:02:19] info: z2m: Removing device '0x00158d0008083d2a' (block: false, force: true)
[2024-05-05 12:02:19] info: z2m: Successfully removed device '0x00158d0008083d2a' (block: false, force: true)

  • changed the config to "adapter: ember" and "rtscts: false" and restarted zigbee2mqtt

[12:06:41] INFO: Preparing to start...
[12:06:42] INFO: Socat not enabled
[12:06:48] INFO: Starting Zigbee2MQTT...
[2024-05-05 12:06:53] info: z2m: Logging to console, file (filename: log.log)
[2024-05-05 12:06:58] info: z2m: Starting Zigbee2MQTT version 1.37.0 (commit #unknown)
[2024-05-05 12:06:58] info: z2m: Starting zigbee-herdsman (0.45.0)
[2024-05-05 12:06:59] info: zh:ember: ======== Ember Adapter Starting ========
[2024-05-05 12:06:59] info: zh:ember:ezsp: ======== EZSP starting ========
[2024-05-05 12:06:59] info: zh:ember:uart:ash: ======== ASH NCP reset ========
[2024-05-05 12:06:59] info: zh:ember:uart:ash: Socket ready
[2024-05-05 12:06:59] info: zh:ember:uart:ash: ======== ASH starting ========
[2024-05-05 12:07:00] info: zh:ember:uart:ash: ======== ASH connected ========
[2024-05-05 12:07:00] info: zh:ember:uart:ash: ======== ASH started ========
[2024-05-05 12:07:00] info: zh:ember:ezsp: ======== EZSP started ========
[2024-05-05 12:07:00] warning: zh:ember: [EzspConfigId] Failed to SET "ADDRESS_TABLE_SIZE" TO "16" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 12:07:00] warning: zh:ember: [EzspConfigId] Failed to SET "APS_UNICAST_MESSAGE_COUNT" TO "32" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 12:07:00] warning: zh:ember: [EzspConfigId] Failed to SET "NEIGHBOR_TABLE_SIZE" TO "26" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 12:07:00] warning: zh:ember: [EzspConfigId] Failed to SET "SOURCE_ROUTE_TABLE_SIZE" TO "200" with status=ERROR_INVALID_VALUE. Firmware value will be used instead.
[2024-05-05 12:07:00] warning: zh:ember: [EzspConfigId] Failed to SET "MULTICAST_TABLE_SIZE" TO "16" with status=ERROR_OUT_OF_MEMORY. Firmware value will be used instead.
[2024-05-05 12:07:00] info: zh:ember: [STACK STATUS] Network up.
[2024-05-05 12:07:00] info: zh:ember: [INIT TC] NCP network matches config.
[2024-05-05 12:07:00] info: zh:ember: [CONCENTRATOR] Started source route discovery. 1248ms until next broadcast.
[2024-05-05 12:07:01] info: z2m: zigbee-herdsman started (resumed)
[2024-05-05 12:07:01] info: z2m: Coordinator firmware version: '{"meta":{"build":0,"ezsp":13,"major":7,"minor":4,"patch":1,"revision":"7.4.1 [GA]","special":0,"type":170},"type":"EmberZNet"}'
[2024-05-05 12:07:01] info: z2m: Currently 0 devices are joined:
[2024-05-05 12:07:01] info: z2m: Zigbee: disabling joining new devices.
[2024-05-05 12:07:01] info: z2m: Connecting to MQTT server at mqtt://core-mosquitto:1883
[2024-05-05 12:07:01] info: z2m: Connected to MQTT server
[2024-05-05 12:07:02] info: z2m: Started frontend on port 8099
[2024-05-05 12:07:02] info: z2m: Zigbee2MQTT started!

  • when I try to pair the same aqara motion sensor...

[2024-05-05 12:07:40] info: z2m: Zigbee: allowing new devices to join.
[2024-05-05 12:07:40] info: zh:ember: [STACK STATUS] Network opened.
[2024-05-05 12:08:08] info: zh:controller: Interview for '0x00158d0008083d2a' started
[2024-05-05 12:08:08] info: z2m: Device '0x00158d0008083d2a' joined
[2024-05-05 12:08:09] info: z2m: Starting interview of '0x00158d0008083d2a'
[2024-05-05 12:08:11] warning: zh:ember: [ZDO] Node descriptor for "7769" reports device is only compliant to revision "pre-21" of the ZigBee specification (current revision: 23).
[2024-05-05 12:08:47] info: zh:controller: Succesfully interviewed '0x00158d0008083d2a'
[2024-05-05 12:08:47] info: z2m: Successfully interviewed '0x00158d0008083d2a', device has successfully been paired
[2024-05-05 12:08:47] info: z2m: Device '0x00158d0008083d2a' is supported, identified as: Aqara Motion sensor (RTCGQ11LM)
[2024-05-05 12:08:47] info: z2m: Configuring '0x00158d0008083d2a'
[2024-05-05 12:08:47] info: z2m: Successfully configured '0x00158d0008083d2a'

so pairing is working and I didn't get the broadcast error now, not while starting up and not while pairing.

So starting over with zigbee2mqtt solved it for me, but that is not possible for everyone I think :-)
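To compare the two pairing runs above more objectively, the interview duration can be read from the log timestamps. The following is a minimal Python sketch that assumes the exact zh:controller message wording shown in these logs (including its "Succesfully" spelling). The function name `interview_seconds` is hypothetical.

```python
import re
from datetime import datetime

STAMP = r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]"
START_RE = re.compile(
    STAMP + r" info: zh:controller: Interview for '(0x[0-9a-f]+)' started"
)
# The zh:controller message is spelled "Succesfully" in the logs above.
DONE_RE = re.compile(
    STAMP + r" info: zh:controller: Succesfully interviewed '(0x[0-9a-f]+)'"
)
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def interview_seconds(log_text):
    """Map each device address to the seconds between interview start and success."""
    started = {
        addr: datetime.strptime(stamp, TIME_FORMAT)
        for stamp, addr in START_RE.findall(log_text)
    }
    durations = {}
    for stamp, addr in DONE_RE.findall(log_text):
        if addr in started:
            finished = datetime.strptime(stamp, TIME_FORMAT)
            durations[addr] = (finished - started[addr]).total_seconds()
    return durations
```

Applied to the two pairing logs above, the timestamps give 11 seconds for the interview under ezsp (12:02:00 to 12:02:11) and 39 seconds under ember (12:08:08 to 12:08:47). A single pairing of a battery-powered sensor is not much of a sample, so this is a data point rather than a conclusion.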

alainsch avatar May 05 '24 10:05 alainsch

so pairing is working and I didn't get the broadcast error now, not while starting up and not while pairing.

So starting over with zigbee2mqtt solved it for me, but that is not possible for everyone I think :-)

No, not completely... after approx 5 minutes, pairing was again not possible. No errors, but the connection / interview didn't start. Tried to restart z2m and reboot the coordinator, nothing helps.

Downgraded the coordinator to the 20231030 FW (EZSP12) and switched back to "adapter: ezsp" and I still got the "error: zh:controller:greenpower: Received undefined command from '0' " messages, but pairing is possible again.

Will see in about 10 minutes...

alainsch avatar May 05 '24 10:05 alainsch

Very very simple configuration here.

HAOS on a QEMU VM on a low-end x86-64 QNAP NAS, with 2 CPUs and 2 GB RAM as suggested by the HAOS setup guide. I have seen a lot of people using VMs or ARM devices: one common point may be low resources in terms of CPU power and/or RAM.

I can report on two setups:

  1. ZBDongle-E with fw 7.4.2, Z2M 1.37.0, ember driver. Only the ZBDongle-E is in the ZigBee network, so the coordinator is the only node. The broadcast errors happen. This may rule out the devices and put the spotlight on the coordinator.
  2. Same ZBDongle-E setup as above, but with 2 Sonoff TRVZB valves added to the ZigBee network: the same error continues to happen. Since it already happened with the coordinator alone in setup 1, I would rule out the 2 added devices as the cause.

Anyway, I see from other posts that the error happens with a variety of devices. Looking for another common factor, every network showing the error has a coordinator, which again puts the spotlight on the coordinator.

I see that @Nerivec is not able to reproduce the issue. Needless to say, Nerivec is also working with a coordinator, which should rule out the coordinator itself (unless there is some elusive hardware factor common to the affected coordinators). Maybe a good starting point would be to run the system on a low-resource/slow host, or a VM with limited resources, to see how Z2M handles the coordinator there.

Another hint may be found in the first post from @julien-billaud: "I've tried the exact same configuration on a regular x86 computer running debian (using the same zigbee dongle) and didn't face any issue which seems to be a linked with the Raspberry pi 4".

I do also have one Sonoff TRVZB.

I also started fresh with a new zigbee2mqtt config and just the coordinator, and the pairing/broadcast issue appeared immediately at start. I don't think that it is an issue with raspberry pi as I am using an x86 machine running a zigbee2mqtt container (docker).

I also observed that a coordinator reset sometimes helped. @Nerivec recommended doing a hard reset of my device (which includes pushing the physical reset button). This helped once, with a start without any issues, but after restarting again I suffered from those errors again.

fir3drag0n avatar May 05 '24 11:05 fir3drag0n

HAOS on qemu VM in low end x86-64 QNAP nas, resources 2 cpu+2 GB ram as suggested by HAOS setup guide. I have seen a lot of ppl using VMs or arm devices: one common point may be low resources in terms of CPU power and/or RAM.

I don't think that it is an issue with raspberry pi as I am using an x86 machine running a zigbee2mqtt container (docker).

Just to get a better understanding: what CPU/RAM does your x86 machine have? What OS is it running? Is it on bare metal or in a virtualization environment such as Proxmox or some other kind of VM? I agree Docker containers are less demanding, but performance is still limited by the host, so it would be useful to know what kind of host is running your Docker and how loaded your x86 system is.

Ricc68 avatar May 05 '24 12:05 Ricc68

It is an Intel® Core™ i3-9100 system with 64 GB ECC RAM. It is running Unraid, a NAS system with virtualization options (Docker or VMs).

fir3drag0n avatar May 05 '24 12:05 fir3drag0n

I have a low-resource VM that mimics the specs of an average Pi 4 to run tests on stuff that I know affects performance. No issue there either. No failed broadcasts without any device, nor with devices, and I have successfully paired & re-paired a dozen devices in the couple of hours it has been running.

But just in case, you can try giving it some breathing room with the adapter_delay setting:

advanced:
  adapter_delay: 20

Default/min is 5, max is 60 (milliseconds). Note that at 60, you are likely to experience some delays when triggering devices rapidly.


PS: I created an issue in the firmware repo for the SLZB-06M and the failing config IDs. May or may not be related to the ensuing troubles, but we need to get to the bottom of it nonetheless. https://github.com/darkxst/silabs-firmware-builder/issues/90
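For reference when reporting to the firmware repo, the failing config IDs can be pulled out of a startup log automatically. Below is a minimal Python sketch that assumes the "[EzspConfigId] Failed to SET" warning format shown in the logs earlier in this thread. The function name `failed_config_ids` is hypothetical.

```python
import re

# Matches the zh:ember warning emitted when the firmware rejects a config value.
CONFIG_WARN_RE = re.compile(
    r'\[EzspConfigId\] Failed to SET "(?P<name>[A-Z_]+)" TO "(?P<value>\d+)" '
    r"with status=(?P<status>[A-Z_]+)"
)


def failed_config_ids(log_text):
    """Return (config name, requested value, status) for every rejected config ID."""
    return [
        (m.group("name"), int(m.group("value")), m.group("status"))
        for m in CONFIG_WARN_RE.finditer(log_text)
    ]
```

On the SLZB-06M logs posted above this would list five entries: four rejected with ERROR_OUT_OF_MEMORY and SOURCE_ROUTE_TABLE_SIZE rejected with ERROR_INVALID_VALUE.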

Nerivec avatar May 05 '24 12:05 Nerivec

adapter_delay: 20

Added the adapter_delay option, no joy:

[2024-05-05 14:42:54] error: zh:ember: Delivery of BROADCAST failed for "65532" [apsFrame={"profileId":0,"clusterId":54,"sourceEndpoint":0,"destinationEndpoint":0,"options":256,"groupId":0,"sequence":170} messageTag=255]
[2024-05-05 14:42:55] error: zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":41440,"clusterId":33,"sourceEndpoint":242,"destinationEndpoint":242,"options":256,"groupId":0,"sequence":171} messageTag=1]
[2024-05-05 14:42:57] error: zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":0,"clusterId":19,"sourceEndpoint":0,"destinationEndpoint":0,"options":1024,"groupId":0,"sequence":53} messageTag=255]
[2024-05-05 14:44:07] error: zh:ember: Delivery of BROADCAST failed for "65533" [apsFrame={"profileId":0,"clusterId":19,"sourceEndpoint":0,"destinationEndpoint":0,"options":1024,"groupId":0,"sequence":59} messageTag=255]

at startup of z2m.

Ricc68 avatar May 05 '24 12:05 Ricc68

I've done a little more testing and figured out "what was wrong". I ran the following tests:

  • Started from a fresh install on the Pi 4 with the latest version of Docker, all from an SD card (the SSD removed from the USB3 port), with only the Dongle-E plugged into the second USB3 port. Everything ran perfectly fine with the ember driver.
  • From that fresh install, I plugged the SSD into the USB3 port, after which it became much less responsive. I rebooted the system and got the exact same "BROADCAST" errors, and nothing was working.
  • I then moved the Dongle-E to one of the USB 2.0 ports and kept the SSD on one of the USB3 ports: no more errors.
  • Last test: booting the Pi 4 from the SSD on USB 3.0 with the Dongle-E on USB 2.0. Now everything is working fine with the ember driver.

To conclude, it seems the ember driver is, for some reason, a little more sensitive (I know that using the dongle without an extension cord isn't ideal). I hope this helps those who are seeing the same "BROADCAST" error after switching from the ezsp to the ember driver, and/or helps identify what in that driver leads to this strange behavior.
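To check a similar setup for this kind of conflict on Linux, it may help to list which attached USB devices are running at USB3 speeds. The following is a rough Python sketch that reads the `speed` and `product` attributes the kernel exposes under /sys/bus/usb/devices. The function name `usb_device_speeds` is hypothetical, and how much USB3 activity actually disturbs a given dongle is not something this script can tell.

```python
from pathlib import Path


def usb_device_speeds(sysfs_root="/sys/bus/usb/devices"):
    """Return (name, product, speed in Mbit/s) for each USB device reporting a speed."""
    devices = []
    for dev in sorted(Path(sysfs_root).iterdir()):
        speed_file = dev / "speed"
        if not speed_file.is_file():
            continue  # interface entries such as "1-1:1.0" have no speed attribute
        product_file = dev / "product"
        if product_file.is_file():
            product = product_file.read_text().strip()
        else:
            product = "unknown"
        # Speeds are reported in Mbit/s, e.g. 12, 480 or 5000 (low speed is "1.5").
        devices.append((dev.name, product, float(speed_file.read_text().strip())))
    return devices
```

Devices reporting 5000 or more are running at USB3 speeds. If an SSD shows up that way right next to the Zigbee dongle, moving the dongle to a USB 2.0 port or onto an extension cable, as described above, is a cheap thing to try.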

julien-billaud avatar May 05 '24 13:05 julien-billaud

Can't be my problem. USB2 Port with 2m extension cable.

supaeasy avatar May 05 '24 13:05 supaeasy

I also have a remote device, which is not affected by USB at all.

fir3drag0n avatar May 05 '24 13:05 fir3drag0n

To conclude, it seems the ember driver is, for some reason, a little more sensitive (I know that using the dongle without an extension cord isn't ideal). I hope this helps those who are seeing the same "BROADCAST" error after switching from the ezsp to the ember driver, and/or helps identify what in that driver leads to this strange behavior.

Can't be my problem either: USB3 port here (no USB2 ports available on the NAS), but a 1 m USB2 extension cable makes it irrelevant.

But look, your case may highlight an interaction or a common factor: internal USB hub activity and disk activity. (Are the USB3 ports all on the same hub? You can have multiple USB ports, but if they all lead to a single hub the bandwidth is shared, and I guess the SSD is using a lot of it.) Again, this may point back to low resources.

To be clear: I don't believe strongly, or exclusively, in the low-resources hypothesis. It is just a fairly clear common factor here, and I don't want to send the investigation down a possibly false route.

In addition, I was thinking about the broadcast error itself. A broadcast is a message that the coordinator sends out to the ZigBee network. In general terms, if I understand correctly what a broadcast is in ZigBee, the message is initiated by the driver, sent over the wire to the coordinator firmware, and finally sent by the firmware to the radio. It's not something coming in, it's something going out, and it should not necessarily expect an answer (think of a ZigBee network consisting only of the coordinator). I tried setting adapter_delay to 60 milliseconds and the error still happens: this makes me think it's not a matter of the timing of commands sent to the coordinator firmware, but that sending this specific broadcast command raises the error. If the firmware is able to send broadcasts over the radio, which is confirmed by the fact that the ezsp driver doesn't show the issue, then it must be something in the ember driver, either packaging the command or sending it over the serial wire. Packaging the command should not be the issue because @Nerivec is not able to reproduce it, so should we think it's something related to sending the command over the wire?

Another interesting question: are all the broadcast commands failing or only some of them? Answering this may lead to another question: if only some of the broadcast commands are failing, what's the difference between a good broadcast command and a failed one?
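As a starting point for that question, the failures already in the logs can be grouped by destination and cluster. The following is a minimal Python sketch that assumes the error format posted in this thread. The function name `tally_failed_broadcasts` is hypothetical.

```python
import json
import re
from collections import Counter

FAILED_RE = re.compile(
    r'Delivery of BROADCAST failed for "(\d+)" \[apsFrame=(\{.*?\})'
)


def tally_failed_broadcasts(log_text):
    """Count failed broadcasts per (destination, profileId, clusterId)."""
    counts = Counter()
    for dest, frame_json in FAILED_RE.findall(log_text):
        frame = json.loads(frame_json)  # the apsFrame is plain JSON
        counts[(int(dest), frame["profileId"], frame["clusterId"])] += 1
    return counts
```

Note that this only counts failures. Successful broadcasts are not visible at this log level, so deciding whether all broadcasts fail or only some would presumably need debug-level logs to compare against.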

Would sniffing the serial port help in understanding these errors?

Ricc68 avatar May 05 '24 13:05 Ricc68