alchemy icon indicating copy to clipboard operation
alchemy copied to clipboard

Gateway termiantes occasionally

Open Zarathustra2 opened this issue 4 years ago • 16 comments
trafficstars

Stacktrace

13:00:18.716 [error] ** Websocket client Alchemy.Discord.Gateway terminating in :websocket_handle/3 for the reason :error::function_clause ** Websocket message was {:text, "{"t":null,"s":null,"op":7,"d":null}"} ** Handler state was %Alchemy.Discord.Gateway.State{<DELETED>} ** Stacktrace: [{:websocket_client, :handle_response, [:ok, Alchemy.Discord.Gateway, { :websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, { :sslsocket, {:gen_tcp, #Port<0.107>, :tls_connection, :undefined}, [#PID<0.9973.0>, #PID<0.9972.0>]}, { :transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, "SOME_CHAR_SEQUENCE_DELETED", :undefined, 1, 1, :undefined, :undefined}], [file: 'FULL_PATH_DELETED/deps/websocket_client/src/websocket_client.erl', line: 545]}, {:websocket_client, :handle_websocket_frame, 2, [file: 'FULL_PATH_DELETED/deps/websocket_client/src/websocket_client.erl', line: 500]}, {:gen_statem, :loop_state_callback, 11, [file: 'gen_statem.erl', line: 1118]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}]

13:00:18.718 mfa=:gen_statem.error_info/7 [error] GenStateMachine #PID<0.9971.0> terminating ** (exit) :function_clause (stdlib 3.12.1) gen_statem.erl:1312: :gen_statem.loop_state_callback_result/11

14:49:25.992 mfa=:gen_statem.error_info/7 [error] GenStateMachine #PID<0.7238.0> terminating (websocket_client 1.4.2) :websocket_client.disconnected(:cast, :connect, { :context, {:websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, {:sslsocket, {:gen_tcp, #Port<0.63>, :tls_connection, :undefined}, [#PID<0.7240.0>, #PID<0.7239.0>]}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, "SOME_CHAR_SEQUENCE_DELETED", :undefined, 1, 1, :undefined, :undefined}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, [], {:wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json'}, {Alchemy.Discord.Gateway, %Alchemy.Discord.Gateway.State{<DELETED>}}, "", false, 0})

Hey @cronokirby,

I have seen this error a lot lately. I am not sure what is causing this maybe you can help me there. Nothing important as the genserver just restarts and resumes to work but would be nice if this can be avoided! :)

Also unsure how to reproduce it... :(

Anyways, thanks for the awesome library!

Zarathustra2 avatar Feb 08 '21 20:02 Zarathustra2

I think that the websocket message is malformed, given that all its fields are null.

Without having looked at the websocket gateway in a while, my hunch would be:

  1. Our parser is too strict here
  2. Our handling of the protocol is wrong or outdated, confusing Discord

cronokirby avatar Feb 08 '21 20:02 cronokirby

This error is appearing here too but in Voice gateway.

Here's the error

15:50:27.789 [debug] Voice Gateway for ID_REDACTED terminated, reason: :function_clause

15:50:28.462 [debug] Voice Gateway for ID_REDACTED connected

15:50:28.620 [error] ** Websocket client Alchemy.Voice.Gateway terminating in :websocket_info/3
   for the reason :error::function_clause
** Last message was {:ssl_closed,
 {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
  [#PID<0.841.0>, #PID<0.840.0>]}}
** Handler state was %Alchemy.Voice.Gateway.State{
  channel: ID_REDACTED,
  discord_ip: nil,
  discord_port: nil,
  guild_id: "ID_REDACTED",
  key: nil,
  my_ip: nil,
  my_port: nil,
  session: "SESSION_REDACTED",
  ssrc: nil,
  token: "TOKEN_REDACTED",
  udp: nil,
  url: "brazil1001.discord.media:443",
  user_id: "ID_REDACTED"
}
** Stacktrace: [
  {Alchemy.Voice.Gateway, :websocket_info,
   [
     {:ssl_closed,
      {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
       [#PID<0.841.0>, #PID<0.840.0>]}},
     {:websocket_req, :wss, 'brazil1001.discord.media', 443, '/', :infinity,
      :undefined, 1,
      {:sslsocket, {:gen_tcp, #Port<0.106>, :tls_connection, :undefined},
       [#PID<0.841.0>, #PID<0.840.0>]},
      {:transport, :ssl, :ssl, :ssl_closed, :ssl_error,
       [mode: :binary, active: true, verify: :verify_none, packet: 0]},
      "__REDACTED__", :undefined, 1, :undefined, :undefined,
      :undefined},
     %Alchemy.Voice.Gateway.State{
       channel: ID_REDACTED,
       discord_ip: nil,
       discord_port: nil,
       guild_id: "ID_REDACTED",
       key: nil,
       my_ip: nil,
       my_port: nil,
       session: "SESSION_REDACTED",
       ssrc: nil,
       token: "TOKEN_REDACTED",
       udp: nil,
       url: "brazil1001.discord.media:443",
       user_id: "ID_REDACTED"
     }
   ], [file: 'lib/Voice/gateway.ex', line: 150]},
  {:websocket_client, :handle_info, 2,
   [file: '/bot/deps/websocket_client/src/websocket_client.erl', line: 448]},
  {:gen_statem, :loop_state_callback, 11, [file: 'gen_statem.erl', line: 1166]},
  {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}
]

15:50:28.621 [debug] Voice Gateway for ID_REDACTED terminated, reason: :function_clause

18:22:35.140 [debug] Shard [0, 1] Disconnected from the Gateway; restarting the Gateway

After this, the bot just stops the Voice and after a while, it disconnects from the main gateway and can't reconnect again.

zLupa avatar Apr 23 '21 06:04 zLupa

I get this on occasion as well. Considering my bot is hooked up to a busy discord server I just have it watch last messages and if no message in 10 minutes then the bot entirely restarts alchemy and dependency application, it's a bit brute force and it does mean messages are lost since that time, but it's working. Would be nice to be fixed though. :-)

OvermindDL1 avatar Apr 27 '21 15:04 OvermindDL1

@OvermindDL1 Why are you watching the messages for 10 minutes and then force a restart? How is this issue impacting your bot? It restarts the bot as the gen server crashes which is only an issue if the bot is playing a media file as @zLupa stated.

I will try to look into it this week, can't promise though! :)

Zarathustra2 avatar Apr 27 '21 15:04 Zarathustra2

Why are you watching the messages for 10 minutes and then force a restart? How is this issue impacting your bot?

That indicates to the bot that its lost connection to Discord.

It restarts the bot as the gen server crashes which is only an issue if the bot is playing a media file as @zLupa stated.

I don't play media files and I still get gateway termination issues on occasion. Alchemy reconnects fine most of the time but occasionally discord messages just seem to stop, don't see a crash SASL log, etc... just nothing even though plenty is happening on Discord. Not much choice at that point but just to restart alchemy.

OvermindDL1 avatar Apr 27 '21 15:04 OvermindDL1

Hmmm, that is weird. Are you on the latest version?

Zarathustra2 avatar Apr 27 '21 15:04 Zarathustra2

Hmmm, that is weird. Are you on the latest version?

Great question, possibly not. Let me see if I can update with minimal breakages, it's a slow time on discord so it should be a good time to do so...

OvermindDL1 avatar Apr 27 '21 15:04 OvermindDL1

I was a couple minor versions behind, 0.6.4, just updated to 0.6.10, it seems to be working so no breakages, I'll keep watch. It only seems to stop getting data after being up anywhere from a few days to a few weeks (memory, process, and CPU usage do not grow as a note) so I'll try to remember to watch for it, it's easy to miss now since I have it auto-restart (preferred so people don't yell at me, ^.^; ).

Speaking of, slightly OT, does alchemy support Discord's Reply information now? I have some functionality in mind that would be useful to add around that functionality and it didn't seem to exist in 0.6.4 anywhere I could see.

OvermindDL1 avatar Apr 27 '21 15:04 OvermindDL1

@OvermindDL1 I just checked the docs it is possible:

You may create a message as a reply to another message. To do so, include a message_reference with a message_id. The channel_id and guild_id in the message_reference are optional, but will be validated if provided

I don't think send_message supports that yet but looks fairly easy to add, I will take a look at that this week as well! Ping me if I forget it

Zarathustra2 avatar Apr 27 '21 15:04 Zarathustra2

@OvermindDL1 I have opened a PR: https://github.com/cronokirby/alchemy/pull/123, waiting for merge now

Zarathustra2 avatar Apr 27 '21 16:04 Zarathustra2

@OvermindDL1 has been merged, if you set alchemy to use master instead of a version you should be able to try it out! :)

Zarathustra2 avatar Apr 27 '21 16:04 Zarathustra2

Or, use the recently published version 0.7.0

cronokirby avatar Apr 27 '21 16:04 cronokirby

Oh awesome, that was fast! You're amazing!

OvermindDL1 avatar Apr 29 '21 16:04 OvermindDL1

For note, I'm still getting a lot of these (snipped anything that looks like it might be private data):

13:12:05.575 [error] GenStateMachine #PID<0.15707.1> terminating
** (FunctionClauseError) no function clause matching in :websocket_client.disconnected/3    
    (websocket_client) ./deps/websocket_client/src/websocket_client.erl:291: :websocket_client.disconnected(:cast, :connect, {:context, {:websocket_req, :wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json', :infinity, :undefined, 1, {:sslsocket, {:gen_tcp, #Port<0.7840>, :tls_connection, :undefined}, [#PID<0.15709.1>, #PID<0.15708.1>]}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, "snip", :undefined, 1, 2, :undefined, :undefined}, {:transport, :ssl, :ssl, :ssl_closed, :ssl_error, [mode: :binary, active: true, verify: :verify_none, packet: 0]}, [], {:wss, 'gateway.discord.gg', 443, '/?v=6&encoding=json'}, {Alchemy.Discord.Gateway, %Alchemy.Discord.Gateway.State{seq: 970, session_id: "snip", shard: [0, 1], token: "snip", trace: ["[\"gateway-prd-main-snip\",{\"micros\":179429,\"calls\":[\"discord-sessions-snip\",{\"micros\":177864,\"calls\":[\"start_session\",{\"micros\":snip,\"calls\":[\"api-prd-main-snip\",{\"micros\":116880,\"calls\":[\"get_user\",{\"micros\":7471},\"add_authorized_ip\",{\"micros\":6594},\"get_guilds\",{\"micros\":7667},\"coros_wait\",{\"micros\":1}]}]},\"guilds_connect\",{\"micros\":2,\"calls\":[]},\"presence_connect\",{\"micros\":26694,\"calls\":[]}]}]}]"], user_id: "snip"}}, "", false, 0})
    (stdlib) gen_statem.erl:1145: :gen_statem.loop_state_callback/11   
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3

OvermindDL1 avatar Apr 29 '21 17:04 OvermindDL1

It looks like I'm able to send reply messages now, but I'm not able to receive such a message with the reply information. It looks like the Alchemy.Message struct doesn't have something that supplies a Alchemy.MessageReference in its struct.

OvermindDL1 avatar Apr 29 '21 17:04 OvermindDL1

In addition, the Alchemy.Message struct is missing the optional guild_id on it, need that to be able to reference as well. :-)

OvermindDL1 avatar Apr 29 '21 18:04 OvermindDL1