node-red-contrib-home-assistant-websocket icon indicating copy to clipboard operation
node-red-contrib-home-assistant-websocket copied to clipboard

Intermittent "connection closed"; shows in log, does not generate an error in Node-red

Open bogorad opened this issue 1 year ago • 20 comments

Describe the bug

Suddenly I'm getting these messages in my NR log:

26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connecting to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connected to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connection closed to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connecting to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connected to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connection closed to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connecting to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connected to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connection closed to http://localhost:8123
26 Dec 20:24:58 - [info] [server:ha-bruc-pi] Connecting to http://localhost:8123

Restarting HA does not help. Restarting Node-red helps, but there's no error generated in NR, so there's no way of knowing that something is wrong.

To Reproduce

No response

Expected behavior

No response

Screenshots

No response

Example Flow

No response

Environment Information

Version:

Home Assistant version: 2023.12.3 Companion version:

Node-RED version: 3.1.3 Docker: yes Add-on: no

Node.js version: OS:

Additional context

No response

bogorad avatar Dec 27 '23 11:12 bogorad

Check Home Assistant log. Increase the log level if nothing is shown at the time of disconnect.

zachowj avatar Dec 28 '23 09:12 zachowj

Thanks for responding! I found something notable - something about "token expired":

NR log:

25 Dec 12:04:12 - [info] Started modified nodes
25 Dec 13:31:49 - [info] [server:ha-bruc-pi] Connection closed to http://localhost:8123
25 Dec 13:31:49 - [info] [server:ha-bruc-pi] Connecting to http://localhost:8123
25 Dec 13:31:49 - [info] [server:ha-bruc-pi] Connected to http://localhost:8123
25 Dec 18:06:19 - [info] [server:ha-bruc-pi] Connection closed to http://localhost:8123

so it died at 13:31:49

HA log:

2023-12-25 13:28:39.010 INFO (MainThread) [homeassistant.components.fan] Setting up fan.miio_yeelink
2023-12-25 13:28:39.011 INFO (MainThread) [custom_components.miio_yeelink] Initializing with host 192.168.11.163 (token a00e6...)
2023-12-25 13:28:44.019 ERROR (MainThread) [custom_components.miio_yeelink] Device office-mi-fan unavailable or token incorrect: Unable to discover the device 192.168.11.163
2023-12-25 13:29:12.992 INFO (SyncWorker_6) [pychromecast] Quitting current app
2023-12-25 13:29:12.992 INFO (SyncWorker_6) [pychromecast.controllers] Receiver:Stopping current app 'E8C28D3C'
2023-12-25 13:31:44.063 INFO (MainThread) [homeassistant.components.fan] Setting up fan.miio_yeelink
2023-12-25 13:31:44.067 INFO (MainThread) [custom_components.miio_yeelink] Initializing with host 192.168.11.163 (token a00e6...)
2023-12-25 13:31:49.074 ERROR (MainThread) [custom_components.miio_yeelink] Device office-mi-fan unavailable or token incorrect: Unable to discover the device 192.168.11.163
2023-12-25 13:31:49.381 INFO (MainThread) [custom_components.nodered.discovery] Updating sensor ade7372eb001b8a7 5b869741b211d1c2
2023-12-25 13:31:50.037 WARNING (Thread-5 (_listen)) [homeconnect.sseclient] Exception while reading event: 
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 81, in __next__
    next_chunk = next(self.resp_iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/homeconnect/api.py", line 168, in _listen
    for event in sse:
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 91, in __next__
    self._connect()
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 56, in _connect
    self.resp = requester.get(self.url, stream=True, **self.requests_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests_oauthlib/oauth2_session.py", line 483, in request
    url, headers, data = self._client.add_token(
                         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 217, in add_token
    raise TokenExpiredError()
oauthlib.oauth2.rfc6749.errors.TokenExpiredError: (token_expired) 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 81, in __next__
    next_chunk = next(self.resp_iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/homeconnect/api.py", line 168, in _listen
    for event in sse:
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 91, in __next__
    self._connect()
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 56, in _connect
    self.resp = requester.get(self.url, stream=True, **self.requests_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests_oauthlib/oauth2_session.py", line 483, in request
    url, headers, data = self._client.add_token(
                         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 217, in add_token
    raise TokenExpiredError()
oauthlib.oauth2.rfc6749.errors.TokenExpiredError: (token_expired) 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/homeconnect/sseclient.py", line 81, in __next__
    next_chunk = next(self.resp_iterator)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration
2023-12-25 13:31:51.051 INFO (Thread-5 (_listen)) [homeconnect.sseclient] Connecting ...
2023-12-25 13:31:51.052 INFO (Thread-5 (_listen)) [homeconnect] Token expired in event stream.
2023-12-25 13:31:51.219 INFO (Thread-5 (_listen)) [homeconnect.sseclient] Connecting ...
2023-12-25 13:31:51.294 INFO (Thread-5 (_listen)) [homeconnect] Listening to event stream for all devices
2023-12-25 13:34:49.129 INFO (MainThread) [homeassistant.components.fan] Setting up fan.miio_yeelink
2023-12-25 13:34:49.130 INFO (MainThread) [custom_components.miio_yeelink] Initializing with host 192.168.11.163 (token a00e6...)
2023-12-25 13:34:54.138 ERROR (MainThread) [custom_components.miio_yeelink] Device office-mi-fan unavailable or token incorrect: Unable to discover the device 192.168.11.163

bogorad avatar Dec 28 '23 11:12 bogorad

The expired token is from home connect, not NR. If there was an error it would have been around 25 Dec 13:31:49 or 25 Dec 18:06:19 when NR was disconnected.

Might have to increase the log level of the websocket component.

edit: Looks like the Device office-mi-fan is going offline at the same time. Are you having network issues?

zachowj avatar Dec 28 '23 13:12 zachowj

edit: Looks like the Device office-mi-fan is going offline at the same time. Are you having network issues?

nope, this device is physically disconnected (offline). but the extension seems to be constantly trying to revive it.

I'll increase the websocket log-level and wait for it to die on me again, thanks!

bogorad avatar Dec 28 '23 14:12 bogorad

any updates on this? Currently struggling with the same problem ...

pittbull avatar Dec 29 '23 21:12 pittbull

any updates on this? Currently struggling with the same problem ...

Enabled websocket logging and lying in wait. So far hasn't happened again :)

bogorad avatar Dec 29 '23 21:12 bogorad

Happens every couple of minutes here but the websocket debug log is huge ...

pittbull avatar Dec 29 '23 21:12 pittbull

Browsing through my websocket debug log on HA I find these two lines which correlates with when Nodered disconnects:

2023-12-29 23:21:02.813 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [546908513984] nodered from 172.17.0.7: Writer done
2023-12-29 23:21:02.813 DEBUG (MainThread) [homeassistant.components.websocket_api.http.connection] [546908513984] nodered from 172.17.0.7: Disconnected

pittbull avatar Dec 29 '23 22:12 pittbull

From what I've gathered so far every 120 seconds Home Assistant peaks on CPU usage (100%) on four cores (I am temporarily running on a Pi4B due to main server being repaired). When this happens Nodered disconnects and the reconnects.

I have disabled all addons and services and HA still peaks on 100% every 120 seconds.

pittbull avatar Dec 30 '23 17:12 pittbull

I am also have this issue:

2024/01/07 11:58:24,stdout,7 Jan 11:58:24 - [info] [server:Home Assistant] Connected to http://localhost:8123
2024/01/07 11:58:23,stdout,7 Jan 11:58:23 - [info] [server:Home Assistant] Connecting to http://localhost:8123
2024/01/07 11:58:23,stdout,7 Jan 11:58:23 - [info] [server:Home Assistant] Connection closed to http://localhost:8123
2024/01/07 11:58:19,stdout,7 Jan 11:58:19 - [info] [server:Home Assistant] Connected to http://localhost:8123
2024/01/07 11:58:15,stdout,7 Jan 11:58:15 - [info] [server:Home Assistant] Connecting to http://localhost:8123
2024/01/07 11:58:15,stdout,7 Jan 11:58:14 - [info] [server:Home Assistant] Connection closed to http://localhost:8123
2024/01/07 11:13:48,stdout,7 Jan 11:13:48 - [info] [server:Home Assistant] Connected to http://localhost:8123
2024/01/07 11:13:44,stdout,7 Jan 11:13:44 - [info] [server:Home Assistant] Connecting to http://localhost:8123
2024/01/07 11:13:44,stdout,7 Jan 11:13:44 - [info] [server:Home Assistant] Connection closed to http://localhost:8123
2024/01/07 10:30:09,stdout,7 Jan 10:30:09 - [info] [server:Home Assistant] Connected to http://localhost:8123
2024/01/07 10:30:06,stdout,7 Jan 10:30:06 - [info] [server:Home Assistant] Connecting to http://localhost:8123
2024/01/07 10:30:05,stdout,7 Jan 10:30:05 - [info] [server:Home Assistant] Connection closed to http://localhost:8123

JasonBSteele avatar Jan 07 '24 12:01 JasonBSteele

On my end the culprit was a timed process which ate all my cpu power (as I am temporarily running on a limited rpi), which was not interruptible which lead to NR disconnecting. I found it my enabling debug logging and carefully watching what happened every 120 sec - as the process spat out huge logs (which also contributed to "locking up" HA). I removed the offending process - temporarily - and everything is back to normal.

pittbull avatar Jan 07 '24 17:01 pittbull

I also experience this every now and then, although I have an IntelNUC with HassOS, so I don't think CPU is the issue for me... Even so, why does it never recover?

Veldkornet avatar Feb 08 '24 11:02 Veldkornet

and it's back after update to 2023.4 and node-red-contrib-home-assistant-websocket 0.63.1

maybe suggest a debugging tool that would help collect data so this can be fixed once and for all?

at the very least - escalate the error so that I can restart NR!

bogorad avatar Mar 20 '24 10:03 bogorad

In my case Whisper LOG shows my voice command “studied light on” then connection lost.

INFO:faster_whisper:Processing audio with duration 00:01.680
INFO:wyoming_faster_whisper.handler: studied light on
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:31> exception=ConnectionResetError('Connection lost')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 41, in run
    if not (await self.handle_event(event)):
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 95, in handle_event
    await self.write_event(self.wyoming_info_event)
  File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 29, in write_event
    await async_write_event(event, self.writer)
  File "/usr/local/lib/python3.9/dist-packages/wyoming/event.py", line 131, in async_write_event
    await writer.drain()
  File "/usr/lib/python3.9/asyncio/streams.py", line 387, in drain
    await self._protocol._drain_helper()
  File "/usr/lib/python3.9/asyncio/streams.py", line 190, in _drain_helper
    raise ConnectionResetError('Connection lost')
ConnectionResetError: Connection lost

The NR Sentence node is not activated. Jarvis does not respond for a while, but I do not have any addon's Watchdog timer enabled.

Curiously the LOG for Mosquitto Broker is showing disconnections from nodered at much the same time.

2024-04-07 17:35:53: Client nodered_722316b578c76c08 disconnected.
2024-04-07 17:35:53: New connection from 192.168.1.98:52742 on port 1883.
2024-04-07 17:35:53: New client connected from 192.168.1.98:52742 as nodered_0408554e8e6f169e (p2, c1, k60, u'nodered').
2024-04-07 17:50:52: Saving in-memory database to /data//mosquitto.db.

donburch888 avatar Apr 08 '24 05:04 donburch888

This issue has been automatically marked as abandoned because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 07 '24 06:06 github-actions[bot]

Same issue

f1lint avatar Jun 07 '24 15:06 f1lint

for the time being, I created a custom helper/timer in HA (30 seconds). I trigger it from NR, wait 60 sec for it to finish, and it it does not, I restart NR (via a webhook running on the host) image image

bogorad avatar Jun 27 '24 18:06 bogorad

@bogorad - would you paste your flow json for that automation? Thanks for the help on a workaround. Stinks this isn't getting a fix.

kars85 avatar Jul 02 '24 16:07 kars85

sure. I simplified it. now there's an "automation" in HA to trigger timer every 30 seconds, and all I have to do in NR is to wait for the timer to stop "starting" for 1 min.

image image image

https://pastebin.com/V16vChpA

bogorad avatar Jul 02 '24 17:07 bogorad