libzwaveip icon indicating copy to clipboard operation
libzwaveip copied to clipboard

Listener issue...

Open aromanro opened this issue 7 years ago • 2 comments

For now I don't know why it's happening, but I'll try to explain what seems to be wrong.

I have several devices, with a couple of them I can reproduce it, for example a wall controller (four buttons) and a thermostat.

Pressing a button on the wall controller makes the listener receive a scene notification. It displays something like this:

Thread 7fd67b574700: accepted connection from 192.168.123.210:41230 Thread 7fd67b574700: wrote 7 bytes 5B039D0001

cmd_class: COMMAND_CLASS_CENTRAL_SCENE v3 cmd: CENTRAL_SCENE_NOTIFICATION Sequence Number > 157 Properties1 > Key Attributes Key Pressed 1 time: Reserved:: 00 Slow Refresh : false

Scene Number > 001 bytestream: 5b 03 9d 00 01

Thread 7fd67b574700: read 17 bytes

With tcpdump I can see two packets, like this:

13:44:43.794128 IP 192.168.123.210.41230 > adrian-X7DWA.41231: UDP, length 77 13:44:43.794177 IP adrian-X7DWA.41231 > 192.168.123.210.41230: UDP, length 61

If I press a button on the thermostat, the listener displays something like this:

Connected to 192.168.123.230

Thread 7fd67b574700: accepted connection from 192.168.123.230:41230 Thread 7fd67b574700: wrote 7 bytes 8F01030380032F06310501420A87028407

cmd_class: COMMAND_CLASS_MULTI_CMD v1 cmd: MULTI_CMD_ENCAP

and so on... I don't think it's relevant, but the problem is that while the connection with the thermostat is active, a press on the button of the wall controller is not received by the listener. I can see with tcpdump the 77 bytes packets being sent, but not the 61 bytes ones:

13:48:27.982357 IP 192.168.123.210.41230 > adrian-X7DWA.41231: UDP, length 77 13:48:29.265798 IP 192.168.123.210.41230 > adrian-X7DWA.41231: UDP, length 77 13:48:30.370554 IP 192.168.123.210.41230 > adrian-X7DWA.41231: UDP, length 77 13:48:30.626107 IP 192.168.123.230.41230 > adrian-X7DWA.41231: UDP, length 77

The log of the zipgateway (the March binary) contains something like this:

When it works:

257262814 DTLS for Classic node 257263490 ApplicationIpCommandHandler: Package is too small. 257265221 ApplicationCommandHandler 27->1 class 0x98 cmd 0x40 size 2 257265287 ApplicationCommandHandler 27->1 class 0x98 cmd 0x81 size 29 257265287 Authentication verified 257265288 ApplicationCommandHandler 27->1 class 0x60 cmd 0x0d size 9 257265289 ApplicationCommandHandler 27->1 class 0x5b cmd 0x03 size 5 257265290 Unhandled command 0x5b:0x03 from fd00:55fc:b136:d42::1b 257265294 257265295 Sending Unsolicited to IP app... 257265296 Sending Unsolicited2 to IP app... 257265297 Sending first attempt 257265298 Frame has no payload 257265299 queue_send_done to node 27 queue 1 257265299 DTLS for Classic node 257265348 Session timeout 0x17ab160 257265349 Closing DTLS connection

When it doesn't work (a connection already existing due of the thermostat key press):

257332607 ApplicationCommandHandler 27->1 class 0x98 cmd 0x40 size 2 257332673 ApplicationCommandHandler 27->1 class 0x98 cmd 0x81 size 29 257332674 Authentication verified 257332675 ApplicationCommandHandler 27->1 class 0x60 cmd 0x0d size 9 257332675 ApplicationCommandHandler 27->1 class 0x5b cmd 0x03 size 5 257332676 Unhandled command 0x5b:0x03 from fd00:55fc:b136:d42::1b 257332680 257332681 Sending Unsolicited to IP app... 257332682 New client session allocated 257332683 Sending Unsolicited2 to IP app... 257332684 New client session allocated 257333182 GetCacheEntryFlag: on non existing node=0 257333183 Sending with scheme 255 257335685 Session timeout 0x17c1c40 257335686 Closing DTLS connection 257335687 Session timeout 0x17e1d28 257335688 Closing DTLS connection

I'll try to figure out more, if I'll find something new I'll add it here.

aromanro avatar Mar 27 '17 10:03 aromanro

More info: Today when I attempted to reproduce it again (after restarting the zip gateway), it worked all right for the above mentioned devices, but the problem occurred for another one.

Each time it's happening I see the GetCacheEntryFlag: on non existing node=0 log message.

aromanro avatar Mar 28 '17 08:03 aromanro

I have more info on it and a possible fix:

First I changed a timeout from 100 ms to 10 ms (see here: https://github.com/Z-WavePublic/libzwaveip/blob/master/libzwaveip/libzwaveip.c#L476 ) for listener connections only (checking is_client). This change made it behave much better, which offered a hint for the solution.

Currently in my code I have it set to 30 ms, but the solution I picked was to expose max_timeouts to be able to change it in the equivalent of zconnection_recv_raw and set it to zero to end the loop if a packet is received (alternatively, use is_running). Basically I signal termination after calling recv here - only for a listener loop, of course, so is_client is checked: https://github.com/Z-WavePublic/libzwaveip/blob/master/libzwaveip/zconnection.c#L176

Sorry for the cryptic description, I changed the code into some C++ code I wrote which is quite different now than this one...

After this change I seem to receive all (most?) of notifications.

aromanro avatar Jun 29 '17 10:06 aromanro