Keeping connection open freezes KLF

Open werner-j opened this issue 5 years ago • 22 comments

Hello,

I keep my connection to the KLF 200 open indefinitely (it should react whenever I issue a command). However, usually within 24 hours I lose the ability to talk to the KLF (no reaction), and when I try to establish a new connection I receive:

Connecting to KLF 200.
Traceback (most recent call last):
  File "/home/velux/klf200control/vlxcontrol/vlxcontrol.py", line 152, in <module>
    LOOP.run_until_complete(init_pyvlx_connection(LOOP))
  File "/usr/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/home/velux/klf200control/vlxcontrol/vlxcontrol.py", line 26, in init_pyvlx_connection
    await pyvlx.load_nodes()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 83, in load_nodes
    await self.nodes.load(node_id)
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/nodes.py", line 70, in load
    await self._load_all_nodes()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/nodes.py", line 86, in _load_all_nodes
    await get_all_nodes_information.do_api_call()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/api_event.py", line 22, in do_api_call
    await self.send_frame()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/api_event.py", line 34, in send_frame
    await self.pyvlx.send_frame(self.request_frame())
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 70, in send_frame
    await self.connect()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 45, in connect
    await self.connection.connect()
  File "/home/velux/.local/lib/python3.7/site-packages/pyvlx/connection.py", line 89, in connect
    ssl=self.create_ssl_context())
  File "/usr/lib/python3.7/asyncio/base_events.py", line 986, in create_connection
    ssl_handshake_timeout=ssl_handshake_timeout)
  File "/usr/lib/python3.7/asyncio/base_events.py", line 1014, in _create_connection_transport
    await waiter
ConnectionAbortedError: SSL handshake is taking longer than 60.0 seconds: aborting the connection

The only thing that helps is power cycling the KLF200: unplugging it from the power source and plugging it back in so that it starts fresh. Any idea what could cause this, or how to work around it?

werner-j avatar Jul 16 '19 08:07 werner-j

Hmm, I have not experienced this. Which firmware version do you have?

I guess there is not much we can do here. It looks like a problem only Velux can fix.

Julius2342 avatar Jul 16 '19 09:07 Julius2342

Werner, I had the same symptoms.

These two fixes have solved the root causes for me: https://github.com/home-assistant/home-assistant/issues/23748 https://github.com/Julius2342/pyvlx/issues/25

@Julius2342 would you be so kind as to release a new version of pyvlx that includes these fixes?

madzrobz avatar Jul 27 '19 07:07 madzrobz

Done. https://github.com/Julius2342/pyvlx/releases/tag/0.2.12

Julius2342 avatar Aug 02 '19 15:08 Julius2342

I think we can close this issue. Please reopen if the problem still exists.

Julius2342 avatar Dec 23 '19 11:12 Julius2342

@Julius2342 Unfortunately I still see this issue on 0.2.12.

apeeters avatar Apr 15 '20 15:04 apeeters

Do you see anything in the logs?

Julius2342 avatar Apr 16 '20 10:04 Julius2342

Nothing but this stacktrace.

The connection seems to survive a few restarts of Home Assistant, but at some point I get the following stacktrace and the only way to recover is power cycling the KLF and restarting Home Assistant.

2020-04-19 22:14:02 WARNING (MainThread) [pyvlx] Connecting to KLF 200.
2020-04-19 22:14:12 WARNING (MainThread) [homeassistant.setup] Setup of velux is taking over 10 seconds.
2020-04-19 22:15:02 WARNING (MainThread) [pyvlx] Connecting to KLF 200.
2020-04-19 22:15:03 ERROR (MainThread) [homeassistant.setup] Error during setup of component velux
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/setup.py", line 171, in _async_setup_component
    hass, processed_config
  File "/usr/src/homeassistant/homeassistant/components/velux/__init__.py", line 31, in async_setup
    await hass.data[DATA_VELUX].async_start()
  File "/usr/src/homeassistant/homeassistant/components/velux/__init__.py", line 69, in async_start
    await self.pyvlx.load_scenes()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 87, in load_scenes
    await self.scenes.load()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/scenes.py", line 51, in load
    await get_scene_list.do_api_call()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/api_event.py", line 22, in do_api_call
    await self.send_frame()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/api_event.py", line 34, in send_frame
    await self.pyvlx.send_frame(self.request_frame())
  File "/usr/local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 70, in send_frame
    await self.connect()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/pyvlx.py", line 45, in connect
    await self.connection.connect()
  File "/usr/local/lib/python3.7/site-packages/pyvlx/connection.py", line 89, in connect
    ssl=self.create_ssl_context())
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 989, in create_connection
    ssl_handshake_timeout=ssl_handshake_timeout)
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 1017, in _create_connection_transport
    await waiter
ConnectionAbortedError: SSL handshake is taking longer than 60.0 seconds: aborting the connection

apeeters avatar Apr 22 '20 08:04 apeeters

Hmm ... no idea how to add a timeout to creating a connection ...
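
A minimal sketch of one way to bound the time spent on connection setup, assuming the connect call is an ordinary coroutine that can be wrapped in `asyncio.wait_for`. The helper name `connect_with_timeout` and its `connect` argument are hypothetical and not part of pyvlx. As a side note, the traceback above shows that `loop.create_connection` takes an `ssl_handshake_timeout` argument (60 seconds by default), so a shorter handshake timeout could probably also be passed there.

```python
import asyncio


async def connect_with_timeout(connect, timeout=10.0):
    """Await connect() but give up after `timeout` seconds.

    `connect` is a hypothetical zero-argument coroutine function, for
    example a lambda wrapping loop.create_connection(...).
    """
    try:
        return await asyncio.wait_for(connect(), timeout=timeout)
    except asyncio.TimeoutError:
        # Report a connection-level error so the caller can retry or log it.
        raise ConnectionError(
            "connection attempt exceeded {} seconds".format(timeout)
        ) from None
```

This does not unfreeze a KLF 200 that has already stopped answering; it only keeps the client from blocking for a full minute on every attempt.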

Julius2342 avatar Apr 26 '20 15:04 Julius2342

My velux connection also fails after every few reboots of Home Assistant

rohrsh avatar May 04 '20 02:05 rohrsh

I had a deeper look into the issue. The problem is that disconnecting the SSL connection is not async: https://github.com/Julius2342/pyvlx/blob/master/pyvlx/connection.py#L76

It looks like the shutdown of Hass became quicker recently, so Hass no longer waits for the SSL connections to shut down (waiting in the general sense, not the `await` keyword).

Therefore the connections on the KLF200 side are not closed correctly, and after several reconnects the KLF200 refuses to accept new connections.
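
A rough sketch of what waiting for the close could look like, assuming a stream-style writer with `close()` and `wait_closed()` (as `asyncio.StreamWriter` has had since Python 3.7). pyvlx itself may use a transport/protocol pair instead, so `close_gracefully` is a hypothetical helper and not the library's code.

```python
import asyncio


async def close_gracefully(writer, timeout=5.0):
    """Close `writer` and wait until the close has completed.

    Returns True if the close finished within `timeout` seconds and
    False otherwise. `writer` is assumed to behave like
    asyncio.StreamWriter.
    """
    writer.close()
    try:
        # Give the TLS shutdown and socket close time to finish before
        # the process is allowed to exit.
        await asyncio.wait_for(writer.wait_closed(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

The shutdown path of the host application would have to await this helper; otherwise the process can still exit before the close completes.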

Julius2342 avatar May 05 '20 15:05 Julius2342

Any news on this issue? I am still seeing this behaviour on Home Assistant 0.110.1. I am available to test and provide logs if helpful.

gaggio avatar Jun 24 '20 09:06 gaggio

Also having this issue pretty consistently and can do the same ^

AlecRust avatar Jun 24 '20 09:06 AlecRust

I also can confirm that KLF freezes. Actually after each HA reboot I have to restart the KLF.

pawlizio avatar Jun 24 '20 11:06 pawlizio

As said, I have no idea how to mitigate this and I need help:

  • The problem is that the connection within the KLF 200 is not cleaned up correctly. The device freezes and you can't connect.
  • There is no disconnect command within the KLF 200.
  • The logic here has not changed within a year.

The only suspicion I have is that the SSL connection is not disconnected properly, because the process ends before everything is cleaned up. But this is just a suspicion. I have no idea what was changed within HASS and what could cause this problem.

Julius2342 avatar Jun 24 '20 11:06 Julius2342

I read in a FHEM forum that they automatically reboot the KLF via the API when they detect connection problems (I don't know how, maybe they count them somehow) to avoid a manual reboot. This would support your suspicion.

I'm not familiar with IP sockets and event loops, but when I tried today to open several new connections without closing them properly, I could not identify any connection problems. I used the demo files from https://www.velux.com/klf200 to perform some tests.

I also tried to reproduce the "SSL handshake takes too long" issue by restarting HA several times, but I could not identify any connection problems. However, whenever I update HA, I have to reboot the KLF. I'm not sure what exactly is causing this issue.

pawlizio avatar Jun 24 '20 20:06 pawlizio

@pawlizio: How can you reboot via the API if you can't connect? 🤔

Julius2342 avatar Jun 25 '20 07:06 Julius2342

Of course you have to reboot in advance, not once you can no longer connect.

What I understand is that at most 2 sockets can be established with the KLF, and that the KLF does not close them properly after 15 minutes without communication, as written in their API description. So your suspicion is that the KLF freezes or becomes unresponsive if you use both sockets without closing them properly.

A possible solution is that when pyvlx establishes a connection for the first time, it could initiate an automatic reboot of the KLF. This ensures that you always have the 2nd socket available if you lose your connection. Just count the reconnects within pyvlx and force a reboot on every 2nd connection (after the 1st connection, after the 3rd connection and so on).
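
The counting idea can be sketched as follows. `RebootPolicy` is a hypothetical helper, not something pyvlx provides, and the actual reboot request to the gateway is deliberately left out.

```python
class RebootPolicy:
    """Decide after which connections a gateway reboot should be requested.

    Follows the scheme described above: reboot after the 1st, 3rd, 5th ...
    connection, so one of the two sockets should always be free.
    """

    def __init__(self):
        self.connections = 0

    def register_connection(self):
        """Count a new connection; return True if a reboot should follow."""
        self.connections += 1
        return self.connections % 2 == 1
```

A caller would invoke `register_connection()` right after each successful connect and request the reboot whenever it returns True.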

pawlizio avatar Jun 25 '20 08:06 pawlizio

I am still experiencing this issue too. (My current plan is to put a WeMo on the Velux power supply so I can restart it each night!)

Incidentally my Velux has two clients - HomeAssistant and a Savant host - so that might explain why I’m hitting any connection limit so quickly and often.

rohrsh avatar Jun 25 '20 09:06 rohrsh

I'm experiencing this issue, mainly when I update HA. I'm powering the KLF via one of the Raspberry Pi USB ports, so in theory I could power the USB port off to force a reboot, but that is probably complicated to do with HassOS (plus I have other devices on the other USB ports).

abey79 avatar Jun 26 '20 14:06 abey79

I created pull request #43 for testing purposes. I implemented the same change in my custom component today. As I could not reproduce the freeze of the KLF systematically, I can only wait and hope that this improves the situation.

pawlizio avatar Jun 26 '20 17:06 pawlizio

@Julius2342: A few days ago I saw this velux API implementation and just now had an idea.

It explains the condition under which TLS stops working: "If there is no communication with the KLF every 10 minutes to 15 minutes, the connection will be disconnected as described in the manual. If this happens when the home monitor 'GW_HOUSE_STATUS_MONITOR_ENABLE_REQ' is activated, the KLF200 is no longer reachable. The KLF200 no longer sends the TLS command 'Change Cipher Spec.'"

As a consequence, to avoid unresponsive communication, we should try to deactivate the house status monitor before closing any connection. Maybe this could solve most of the problems during HA restarts.
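
A sketch of that ordering, assuming two coroutine functions are available: one that asks the gateway to disable the house status monitor and one that closes the connection. Both `disable_monitor` and `disconnect` are hypothetical placeholders here, not confirmed pyvlx calls.

```python
import asyncio


async def shutdown_connection(disable_monitor, disconnect, timeout=5.0):
    """Disable the house status monitor first, then disconnect.

    Returns True if the monitor was disabled before the disconnect and
    False if that step timed out or failed. The disconnect is attempted
    in both cases.
    """
    try:
        await asyncio.wait_for(disable_monitor(), timeout=timeout)
        monitor_disabled = True
    except (asyncio.TimeoutError, OSError):
        # The gateway may already be unresponsive; still close our side.
        monitor_disabled = False
    await disconnect()
    return monitor_disabled
```

Whether this actually prevents the freeze is untested; it only guarantees that the disable request is sent and given time to complete before the socket is closed.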

pawlizio avatar Mar 14 '21 11:03 pawlizio

Hello! I have a similar issue: https://github.com/home-assistant/core/issues/48182

Enrico

enricozocca avatar Apr 04 '21 09:04 enricozocca