err-backend-mattermost icon indicating copy to clipboard operation
err-backend-mattermost copied to clipboard

Connection lost permanently on TCP RST

Open a-gerhard opened this issue 1 year ago • 2 comments

We have an issue that our bot becomes unresponsive after a restart of mattermost's reverse proxy.

Steps to reproduce:

  • setup up mattermost behind nginx reverse proxy; connect bot through the reverse proxy. Both mattermost and nginx are running inside docker, with http/s ports directly published from the container.
  • start bot and connect to mattermost instance (verify it's working).
  • restart the nginx server on mattermost (docker-compose restart nginx)

What we'd expect

  • The bot tries to reconnect to mattermost

What happens

  • we see a log entry showing that the connection was closed
  • however, the bot doesn't attempt to reconnect
  • the bot also doesn't exit, process stays active
  • there are no ping/pong message log entries anymore in this state
  • a restart of the bot is required to solve the issue

From the log:

2023-03-16 20:03:06,274 DEBUG    websockets.client         % sending keepalive ping
2023-03-16 20:03:06,274 DEBUG    websockets.client         > PING 37 8d 3d 0e [binary, 4 bytes]
2023-03-16 20:03:06,277 DEBUG    websockets.client         < PONG 37 8d 3d 0e [binary, 4 bytes]
2023-03-16 20:03:06,278 DEBUG    websockets.client         % received keepalive pong
2023-03-16 20:03:17,949 DEBUG    websockets.client         ! failing connection with code 1006
2023-03-16 20:03:17,955 DEBUG    websockets.client         = connection is CLOSED
2023-03-16 20:03:17,956 DEBUG    websockets.client         x closing TCP connection

(no related messages in the log afterwards)

a-gerhard avatar Apr 06 '23 11:04 a-gerhard

Python 3.9.2 venv using pypi packages:

errbot                  6.1.9
mattermostdriver        7.3.2

We've noticed this issue today when our mattermost host crashed. Our bot kept running but the websocket was never restored.

The mattermost machine was forcefully shutdown at 09:06:50~

2023-04-10 09:06:12,201 DEBUG    websockets.client         > PING c8 26 8b 73 [binary, 4 bytes]
2023-04-10 09:06:12,203 DEBUG    websockets.client         < PONG c8 26 8b 73 [binary, 4 bytes]
2023-04-10 09:06:12,203 DEBUG    websockets.client         % received keepalive pong
2023-04-10 09:06:20,044 DEBUG    websockets.client         > PONG '' [0 bytes]
2023-04-10 09:06:32,217 DEBUG    websockets.client         % sending keepalive ping
2023-04-10 09:06:32,218 DEBUG    websockets.client         > PING fc 27 ea 57 [binary, 4 bytes]
2023-04-10 09:06:50,064 DEBUG    websockets.client         > PONG '' [0 bytes]
2023-04-10 09:06:52,221 DEBUG    websockets.client         ! timed out waiting for keepalive pong
2023-04-10 09:06:52,222 DEBUG    websockets.client         ! failing connection with code 1011
2023-04-10 09:06:52,223 DEBUG    websockets.client         = connection is CLOSING
2023-04-10 09:06:52,223 DEBUG    websockets.client         > CLOSE 1011 (unexpected error) keepalive ping timeout [24 bytes]
2023-04-10 09:07:02,234 DEBUG    websockets.client         ! timed out waiting for TCP close
2023-04-10 09:07:02,235 DEBUG    websockets.client         x closing TCP connection
2023-04-10 09:07:12,246 DEBUG    websockets.client         ! timed out waiting for TCP close
2023-04-10 09:07:12,247 DEBUG    websockets.client         x aborting TCP connection
2023-04-10 09:07:12,248 DEBUG    websockets.client         = connection is CLOSED
  • No related messages afterwards, just the absence of the keepalive pings stood out.
  • Restarting the bot will restore the connection.

shmoun avatar Apr 10 '23 21:04 shmoun

Found this issue in the mattermostdriver repo

Now running with the patched websocket.py to see if the websocket survives a reboot of the mattermost host.

This custom websocket.py depends on aiohttp.

shmoun avatar Apr 12 '23 20:04 shmoun