MySensors icon indicating copy to clipboard operation
MySensors copied to clipboard

MySensors entities are not properly set when activating scenes/automations in Home Assistant

Open feanor-anglin opened this issue 3 years ago • 43 comments

I have the same problem as described here: https://github.com/home-assistant/core/issues/59388

When sending multiple messages at the same time from HA to MySensors nodes I frequently experience some messages remain undelivered. The author of the issue I've linked to has managed to identify the problem is on MySensors side, but haven't provided any details. In my opinion this is a really serious disadvantage, probably the most important I experience using MySensors in my project. Can we do something about it? :)

Additional details: I use Ethernet Gateway and RS485; the problem is also present when using Serial Gateway and wireless communication.

feanor-anglin avatar Jul 08 '22 10:07 feanor-anglin

See also issue #1512.

@feanor-anglin Hi, do you also use signing? And do you use NRF24 radios for wireless?

virtual-maker avatar Jul 08 '22 16:07 virtual-maker

No, I don't use signing, since I use RS485 wired communication. The guy from home assistant issue said he used wireless.

feanor-anglin avatar Jul 08 '22 17:07 feanor-anglin

I assume it has to do with RS485. Can you provide debug logs from gateway and node? I think best is to start a discussion in MySensors forum for this.

virtual-maker avatar Jul 08 '22 17:07 virtual-maker

Unfortunately, I can't provide debug from gateway, since there is too little memory to enable it. But, the author of the issue I've linked has provided some.

2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 6;0;1;1;41;ffff60b8                                                                                                                           
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;2;1                                                                                                                                   
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;3;89
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Sending 7;0;1;1;41;ffff00d6
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736274 TSF:MSG:SEND,0-0-6-6,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736274 TSF:MSG:SEND,0-0-6-6,s=0,c=3,t=16,pt=0,l=0,sg=1,ft=0,st=OK:
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736282 !MCO:PRO:RC=1
2021-11-08 18:56:31 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736282 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736321 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736321 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736328 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736328 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736332 TSF:MSG:READ,6-6-0,s=255,c=3,t=17,pt=6,l=25,sg=0:<NONCE>
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736332 TSF:MSG:READ,6-6-0,s=255,c=3,t=17,pt=6,l=25,sg=0:<NONCE>
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736339 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736339 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736378 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736378 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736385 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736385 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736389 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736389 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736428 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736428 !TSF:MSG:SEND,0-0-7-7,s=0,c=3,t=16,pt=0,l=0,sg=0,ft=0,st=NACK:
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736435 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736435 !TSF:MSG:SIGN FAIL
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736439 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107736439 !MCO:PRO:RC=1
2021-11-08 18:56:32 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107736442 !MCO:PRO:RC=1
...
2021-11-08 18:56:36 DEBUG (MainThread) [mysensors.transport] Receiving 0;255;3;0;9;1107741282 !TSF:MSG:SIGN FAIL                                                                                                  
2021-11-08 18:56:36 DEBUG (MainThread) [mysensors.handler] n:0 c:255 t:3 s:9 p:1107741282 !TSF:MSG:SIGN FAIL

feanor-anglin avatar Jul 08 '22 21:07 feanor-anglin

Those messages show signing failures, so the cause of those messages is probably not related to your setup.

mfalkvidd avatar Jul 09 '22 07:07 mfalkvidd

Ok, so it seems there are 2 kinds of problems with sending messages with short time interval: signing problem and RS485 problem. Shortly I will try to check if the issue is also applicable to wireless setup with no signing and RS485 setup under domoticz (serial and Ethernet).

Do you guys think we should move this discussion to mysensors forum?

feanor-anglin avatar Jul 10 '22 12:07 feanor-anglin

Github is better if the problem is related to the library code. The forum is better for general troubleshooting, as there are many more people with practical experience of MySensors in the forum.

Since we're really at the troubleshooting stage, we might get better help in the forum. Hopefully someone with a RS485 setup can chime in.

mfalkvidd avatar Jul 16 '22 11:07 mfalkvidd

I've debugged exact same problem, I did not use signing. I've used PJON with gateway-node communication, but it's not on that layer IMHO. I could reproduce the issue even on the serial gateway without any node too.

You can simply flood the gateway with a lot of messages, and only part of the messages gets executed and the others are lost. That's why in many places you recommend using wait() to not flood the gateway. Unfortunately, the Home Assistant sends messages in async nearly at the same time.

In comparison, I've noticed that OpenHab configuration has the sendDelay property https://github.com/tobof/openhab-addons/wiki/Configuration#configuring-the-gateway, but didn't manage to test it.

Does the gateway implement any buffer for received messages?

dgertych-monterail avatar Sep 21 '22 07:09 dgertych-monterail

@chytreg I came to the same conclusion about this issue, very nice to hear you support it. The sendDelay you are talking about seems very nice workaround. It should be proposed to HA-MySensors integration devs.

feanor-anglin avatar Sep 21 '22 10:09 feanor-anglin

@feanor-anglin Looks like Domoticz implements sendDelay too.

As for HomeAssistant they are using pymysensors with the asyncio. I'm not sure if this send method is blocking or not, but it's worth to try put some sleep there and check.

dgertych-monterail avatar Sep 21 '22 13:09 dgertych-monterail

@chytreg I've reported this issue to HA and included your findings. Hope something will be finally done with this.

feanor-anglin avatar Sep 24 '22 12:09 feanor-anglin

It seems HA dev requires the sendDelay parameter to be put into the official MySensors documentation. @mfalkvidd, are you the right person to ask for it?

feanor-anglin avatar Sep 25 '22 19:09 feanor-anglin

@feanor-anglin sorry, I don't understand. MySensors does not have anything called sendDelay. According to a comment above, openHab does have such a property. But would it really make sense for MySensors to document a property that exists in openHab?

mfalkvidd avatar Sep 25 '22 20:09 mfalkvidd

I did some tests, with HA and Domowicz here are my findings, https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031 My setup: ESP8266 as TCP Gateway + 1 Node on Arduino Mega, with 16 relays, I use PJON.

You can simply flood the gateway with a lot of messages, and only part of the messages gets executed and the others are lost. That's why in many places you recommend using wait() to not flood the gateway. Unfortunately, the Home Assistant sends messages in async nearly at the same time

It's not about flooding the gateway, it's about that node requires some time to process the messages. I think it's not even a HA vs Domowicz thing. In my simple node sketch when I remove wait(100) I receive a lot of NACK on node part.

The same happens in HA, because it sends all the command in the same time to the Node. Domoticz on the other hand send command in the queue with some delay so the Node has time to process them.

There is one thing I don't understand why I need to use wait(100) is it hardware limitation or software one? Thus I believe Domowicz / OpenHub implemented send in synchronous / queued way. cc @mfalkvidd

I did some recordings: Domoticz: Turn ON/OFF all relays => NO NACKs on Node

https://user-images.githubusercontent.com/142751/192165428-d4903f7b-7c75-45a8-af95-cec435090c83.mov

HA: Turn ON all relays => NACKs on Node

https://user-images.githubusercontent.com/142751/192165484-3e5c6f1c-d8a8-4908-97ef-f31ba99c5c06.mov

dgertych-monterail avatar Sep 25 '22 21:09 dgertych-monterail

@MartinHjelmare Could you take a look, I'm trying to understand where is the problem, is it hardware limitation of message processing or it's software issue with MySensors. Why other controllers like Domoticz or OpenHub implemented communication with MySensors a bit different than HA.

dgertych-monterail avatar Sep 25 '22 21:09 dgertych-monterail

@mfalkvidd, @feanor-anglin Hello, I'm also completely unclear who @mfalkvidd is supposed to ask for what! @feanor-anglin a little more explanation would be helpful.

I did some googling and found a parameter send_delay in the Pilight integration of HA. I think from HA point of view Pilight is a similar usecase as MySensors.

HA Pilight send_delay @feanor-anglin Do you mean something like this?

virtual-maker avatar Sep 25 '22 21:09 virtual-maker

@virtual-maker Smth like that, could solve the problem. Domoticz and OpenHub, implemented it that way. My main question is why it's even needed? Why I have to wait(100) between send messages to avoid NACKs on the Node level.

dgertych-monterail avatar Sep 25 '22 21:09 dgertych-monterail

@chytreg I have no experience with PJON transport. I'm using NRF24. And there is a problem when two or more nodes send at almost the same time which produces collisions of the telegrams with NACKs for both nodes.

But what I see from your video is you have a gateway and only one node with 16 relays connected to it. Is this right?

virtual-maker avatar Sep 25 '22 21:09 virtual-maker

@virtual-maker

I have no experience with PJON transport. I'm using NRF24

Imho transport doesn't matter, I was able to reproduce the same problem on USB serial and HA which trigger the scene.

But what I see from your video is you have a gateway and only one node with 16 relays connected to it. Is this right?

Yes

And there is a problem when two or more nodes send at almost the same time which produces collisions of the telegrams with NACKs for both nodes

Hmm, imho HA sends turn_on message to all 16 relays in the same time, maybe that's the problem? Domoticz does one by one message. The messages goes nicely though gateway, but the NACKs pop up on Node https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031 all the logs in gist.

dgertych-monterail avatar Sep 25 '22 21:09 dgertych-monterail

@chytreg Thank your for your logs. This is helpful!

This is from your HA log:

22:39:31.564 > 477355 TSF:MSG:READ,0-0-3,s=14,c=1,t=2,pt=0,l=1,sg=0:1
22:39:31.570 > 477360 TSF:MSG:ECHO REQ
22:39:31.573 > 477362 !PJON:SND:FAIL
22:39:31.576 > 477365 !TSF:MSG:SEND,3-3-0-0,s=14,c=1,t=2,pt=0,l=1,sg=0,ft=4,st=NACK:1

You can see there is a separate telegram from gateway to node for each relay with additional echo request. This echo tells HA the new state of the relay (to display this actual new state in HA screen). But this echo send fails often for your setup.

I assume that the gateway try to send the next telegram for the next relay and at same time the node tries to send the echo. And I further assume that PJON transport layer is not able to handle telegram collisions but only says "NACK".

virtual-maker avatar Sep 25 '22 21:09 virtual-maker

With "handle" I mean automatically resolve the collision.

virtual-maker avatar Sep 25 '22 21:09 virtual-maker

@virtual-maker This is part of the logs when I comment out wait from https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031#file-arduino_mega_pjon_multi_realy_node-cpp-L58 & https://gist.github.com/chytreg/dcd1c7449fca25e9480c7c463d7ec031#file-arduino_mega_pjon_multi_realy_node-cpp-L27

22:28:28.877 > 5048 TSF:MSG:SEND,3-3-0-0,s=4,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=OK:0
22:28:28.923 > 5083 !PJON:SND:FAIL
22:28:28.923 > 5085 !TSF:MSG:SEND,3-3-0-0,s=5,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=NACK:0
22:28:28.948 > 5120 !PJON:SND:FAIL
22:28:28.948 > 5122 !TSF:MSG:SEND,3-3-0-0,s=6,c=1,t=2,pt=2,l=2,sg=0,ft=1,st=NACK:0
22:28:28.984 > 5156 TSF:MSG:SEND,3-3-0-0,s=7,c=1,t=2,pt=2,l=2,sg=0,ft=2,st=OK:0
22:28:29.022 > 5190 !PJON:SND:FAIL
22:28:29.022 > 5192 !TSF:MSG:SEND,3-3-0-0,s=8,c=1,t=2,pt=2,l=2,sg=0,ft=0,st=NACK:0
22:28:29.056 > 5227 !PJON:SND:FAIL
22:28:29.056 > 5229 !TSF:MSG:SEND,3-3-0-0,s=9,c=1,t=2,pt=2,l=2,sg=0,ft=1,st=NACK:1

In your opinion, is this also a collision? Anyway, is it possible to handle it somehow with a queue on a Gateway level, or Node level to throttle messages? (@mfalkvidd) HA does not have throttling option implemented (that's why we have collisions), I suppose Domoticz and OpenHub kinda have one, thus scene works.

dgertych-monterail avatar Sep 25 '22 22:09 dgertych-monterail

No, this is not a collision. But when you send two or more messages one after other then you shall add a wait() to give the receiver some ms to process the message. With NRF24 I use 10ms but this depends from the data transfer rate and will differ with PJON for sure. In your code (without wait()) you flood the gateway from the node side. Basically to send a telegram takes less time than the gateway is able to process it.

virtual-maker avatar Sep 25 '22 22:09 virtual-maker

@feanor-anglin sorry, I don't understand. MySensors does not have anything called sendDelay. According to a comment above, openHab does have such a property. But would it really make sense for MySensors to document a property that exists in openHab?

As we managed to determine, sendDelay exists in openHab and Domoticz and seems to provide smooth operation with MySensors, while Home Assistant usually fails when trying to send more than one message to MySensors gateway at once. My proposition was to put sendDelay (or however you call it) as a requirement to controllers into the official MySensors documentation.

feanor-anglin avatar Oct 04 '22 20:10 feanor-anglin

Hey, @mfalkvidd have you seen my answer above? Does it make sense to you now?

feanor-anglin avatar Oct 12 '22 10:10 feanor-anglin

Thanks for the reminder @feanor-anglin

Yes I have seen it, but I still have trouble understanding.

Is the sendDelay a setting configurable by the user, or is it a fixed value?

If it is a fixed value, what value do you recommend? If it is a user setting, what would the default value be (maybe 0, to not cause unnecessary problems?) How would the user determine if they need to change the setting, and how does the user know what value to set?

mfalkvidd avatar Oct 13 '22 08:10 mfalkvidd

btw, is anyone aware of any existing MySensors requirement documentation for Controllers? I think it would make sense to put this new requirement in the same place, but I don't know where that place is.

mfalkvidd avatar Oct 13 '22 08:10 mfalkvidd

@mfalkvidd these are all very good questions, since sendDelay is more like a workaround, than a real solution.

As far as I understand it (please let me know if you agree), while there is no queue implemented in gateway, the real solution should be like this:

  1. Controller sends a message to node.
  2. Gateway passes the message.
  3. Node receives the messages and send back an ack.
  4. The ack is then passed to the controller.
  5. Only after receiving an ack the controller is allowed to send another message.*

*Clearly, with a defined (or definable) timeout period.

sendDelay is a more primitive solution in which the controller waits a given period of time between following messages hoping everything works as expected.

Referring the details you asked about: sendDelay: configurable by the user in ms (it depends from many factors, e.g. transmission baud rate, how large the value should be), default may be 0 or few ms; 2 bytes should be enough I suppose, 0-65536. The same rules could apply to the timeout I described as the preferred option.

If there is no such thing as 'MySensors requirements for controllers documentation', it should certainly come to existence, giving the limitations of gateway capabilities.

feanor-anglin avatar Oct 13 '22 08:10 feanor-anglin

Yes, the acknowledgement method could be better, if MySensors supported acknowledgement. Adding support for acknowledgement has been discussed extensively, but I don't think such support will be added in the near term so the sendDelay solution is the realistic way to go.

I have created https://www.mysensors.org/controller/requirements as a placeholder for controller requirements.

mfalkvidd avatar Oct 14 '22 11:10 mfalkvidd

I don't entirely understand what you mean by "MySensors doesn't support ack". I always use ack e.g. under MYSController and nodes always send it. Maybe you mean that ack is not implemented on transport layer? But this is not really an issue here.

feanor-anglin avatar Oct 14 '22 13:10 feanor-anglin