twilight
twilight copied to clipboard
Gateway tries to read from stream after closure on resumes
Resumes currently log something like this:
2022-01-08T12:06:22.516997Z DEBUG twilight_gateway::shard::processor::r#impl: got request to reconnect
2022-01-08T12:06:22.517015Z DEBUG twilight_gateway::shard::processor::r#impl: resuming shard [91, 240]
2022-01-08T12:06:22.638171Z WARN twilight_gateway::shard::processor::socket_forwarder: socket errored: IO error: unexpected end of file
2022-01-08T12:06:22.638190Z DEBUG twilight_gateway::shard::processor::socket_forwarder: Leaving loop
2022-01-08T12:06:22.724225Z DEBUG twilight_gateway::shard::processor::r#impl: Shook hands with remote
2022-01-08T12:06:22.724254Z DEBUG twilight_gateway::shard::processor::socket_forwarder: starting driving loop
2022-01-08T12:06:22.724293Z DEBUG twilight_gateway::shard::processor::r#impl: got hello with interval 41250
2022-01-08T12:06:22.724298Z DEBUG twilight_gateway::shard::processor::r#impl: resuming with sequence 4284, session id 219740e6a206b8844428c86927b60cb8
This indicates that twilight knows it was requested to reconnect, but still attempts to read from the closed TCP stream. This should not happen, for two reasons:
- It results in a lot of verbose logging the more shards are created
- Dropping the log level might suppress "real" I/O related issues
I assume this happens because the socket forwarder isn't stopped in time (https://github.com/twilight-rs/twilight/blob/main/gateway/src/shard/processor/socket_forwarder.rs#L84).
Assigning to self as this will likely be fixed as a result of my gateway refactor.
Fixed in next, Twilight 0.14
Actually the issue is due to a bug in tokio-rustls (see snapview/tungstenite-rs#316 for some more research into the issue). The rewrite merely sidesteps this issue by disconnecting the WebSocket connection prior to what the WebSocket protocol demands. Note that Discord somewhat says that this is okay (https://discord.com/developers/docs/topics/gateway#initiating-a-disconnect):
If you simply close the TCP connection or use a different close code, the session will remain active and timeout after a few minutes. This can be useful when you're Resuming the previous session.
@vilgotf was this meant to be reopened?
Yes. There's a hack in the gateway that works around this but I think this should be opened until the root issue is fixed (as a reminder).
Fix available in 0.15.0 and 0.14.2 🎉