twilight icon indicating copy to clipboard operation
twilight copied to clipboard

Gateway tries to read from stream after closure on resumes

Open Gelbpunkt opened this issue 3 years ago • 1 comments

Resumes currently log something like this:

2022-01-08T12:06:22.516997Z DEBUG twilight_gateway::shard::processor::r#impl: got request to reconnect
2022-01-08T12:06:22.517015Z DEBUG twilight_gateway::shard::processor::r#impl: resuming shard [91, 240]
2022-01-08T12:06:22.638171Z  WARN twilight_gateway::shard::processor::socket_forwarder: socket errored: IO error: unexpected end of file
2022-01-08T12:06:22.638190Z DEBUG twilight_gateway::shard::processor::socket_forwarder: Leaving loop
2022-01-08T12:06:22.724225Z DEBUG twilight_gateway::shard::processor::r#impl: Shook hands with remote
2022-01-08T12:06:22.724254Z DEBUG twilight_gateway::shard::processor::socket_forwarder: starting driving loop
2022-01-08T12:06:22.724293Z DEBUG twilight_gateway::shard::processor::r#impl: got hello with interval 41250
2022-01-08T12:06:22.724298Z DEBUG twilight_gateway::shard::processor::r#impl: resuming with sequence 4284, session id 219740e6a206b8844428c86927b60cb8

This indicates that twilight knows it was requested to reconnect, but still attempts to read from the closed TCP stream. This should not happen, for two reasons:

  1. It results in a lot of verbose logging the more shards are created
  2. Dropping the log level might suppress "real" I/O related issues

I assume this happens because the socket forwarder isn't stopped in time (https://github.com/twilight-rs/twilight/blob/main/gateway/src/shard/processor/socket_forwarder.rs#L84).

Gelbpunkt avatar Jan 08 '22 12:01 Gelbpunkt

Assigning to self as this will likely be fixed as a result of my gateway refactor.

spring4175 avatar Mar 29 '22 21:03 spring4175

Fixed in next, Twilight 0.14

vilgotf avatar Sep 27 '22 06:09 vilgotf

Actually the issue is due to a bug in tokio-rustls (see snapview/tungstenite-rs#316 for some more research into the issue). The rewrite merely sidesteps this issue by disconnecting the WebSocket connection prior to what the WebSocket protocol demands. Note that Discord somewhat says that this is okay (https://discord.com/developers/docs/topics/gateway#initiating-a-disconnect):

If you simply close the TCP connection or use a different close code, the session will remain active and timeout after a few minutes. This can be useful when you're Resuming the previous session.

vilgotf avatar Dec 01 '22 14:12 vilgotf

@vilgotf was this meant to be reopened?

spring4175 avatar Jan 09 '23 16:01 spring4175

Yes. There's a hack in the gateway that works around this but I think this should be opened until the root issue is fixed (as a reminder).

vilgotf avatar Jan 09 '23 16:01 vilgotf

Fix available in 0.15.0 and 0.14.2 🎉

vilgotf avatar Feb 06 '23 13:02 vilgotf