Graceful shutdowns - ability for WebSocketGateways to run async clean up logic per connection on shutdown
Is there an existing issue that is already proposing this?
- [X] I have searched the existing issues
Is your feature request related to a problem? Please describe it
I have some logic that runs on handleDisconnect. When a user disconnects, it scrubs some information from Redis and MongoDB.
I need both of these connections to remain open while I do my clean up work.
So I enable shutdown hooks for graceful shutdowns. When the server receives a termination signal, it disconnects all users who are connected to the websocket server, which invokes handleDisconnect for all users. Perfect. But it doesn't 'wait' for my work to complete before it proceeds to shut down the Redis and MongoDB connections. My handleDisconnect methods are async, though this doesn't seem to help.
The end result is that when a server shuts down, all users who are currently connected will 'leak' information into Redis and MongoDB, as I didn't get a chance to clean it up. My code is 'self-repairing' and this leaked information will eventually be cleared up, but this is just a safety net, and until the data is cleaned up there are odd quirks (like a user appearing to be in a chat lobby they are no longer in). I do not want to rely on this behaviour every time my server restarts (which is at least once a day, as we host on Heroku).
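To illustrate the ordering problem without NestJS, here is a minimal sketch. It assumes the promise returned by an async disconnect handler is simply not awaited during shutdown, which is what the behaviour above suggests. All names here (`events`, `shutdownWithoutWaiting`, `shutdownAndWait`) are made up for the example, and the timer stands in for the Redis/MongoDB round trip.

```typescript
// Records the order in which things happen.
const events: string[] = [];

// Stand-in for a gateway's async handleDisconnect (hypothetical).
async function handleDisconnect(userId: string): Promise<void> {
  events.push(`cleanup started: ${userId}`);
  // Simulated Redis/MongoDB round trip.
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  events.push(`cleanup finished: ${userId}`);
}

// What I seem to be observing: each handler is invoked, but its promise
// is dropped, so the connections close before the clean up finishes.
function shutdownWithoutWaiting(userIds: string[]): void {
  for (const id of userIds) {
    void handleDisconnect(id);
  }
  events.push("database connections closed");
}

// What I would like: every handler is awaited before anything is closed.
async function shutdownAndWait(userIds: string[]): Promise<void> {
  await Promise.all(userIds.map((id) => handleDisconnect(id)));
  events.push("database connections closed");
}
```

With `shutdownWithoutWaiting`, "database connections closed" is recorded before any "cleanup finished" entry. With `shutdownAndWait` it is always recorded last.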
Describe the solution you'd like
I'd like to be able to perform some async clean up work for each client that is being disconnected as a result of server shutdown. I still need access to external services like Redis and MongoDB.
Teachability, documentation, adoption, migration strategy
What is the motivation / use case for changing the behavior?
(This is a simplified example)
- A WebSocketGateway chat server. Lobbies are managed on MongoDB
- Users who connect are assigned a random lobby - the change is written to MongoDB
- Users who disconnect are removed from their lobbies - the changes are written to MongoDB
- With shutdown hooks/graceful shutdown enabled, the WebSocketGateway is able to disconnect each user one by one and perform the async disconnect operation on each one prior to termination.
My current solution is as follows (I am testing it now, unsure if it will work in practice but will keep posted...):
- Custom adapter class (RedisIoAdapter)
app.useWebSocketAdapter(new RedisIoAdapter(app, rediosIoAdaptorConfig))
- Adapters do not receive lifecycle events. I need to perform my clean up in beforeApplicationShutdown, as this is the only time we have before everything gets disconnected (MongoDB etc.) by NestJS.
- Workaround... RedisModule has a list of RedisIoAdapters. The RedisIoAdapter constructor searches for RedisModule and adds itself to the list
constructor(
app: INestApplicationContext,
private config: RedisIoAdaptorConfig
) {
....
app.resolve(RedisModule).then((module) => {
module.registerIoAdapter(this);
});
}
- RedisModule implements beforeApplicationShutdown. RedisModule.beforeApplicationShutdown iterates the list of RedisIoAdapters and invokes beforeApplicationShutdown on each one... Workaround complete...
- RedisIoAdapter.createIOServer stores a reference to all the servers it has created.
private servers: Server[] = [];
createIOServer(port: number, options?: ServerOptions): any {
this.logger.log("Creating io server...");
...
const server = super.createIOServer(port, options) as Server;
server.adapter(this.adapterConstructor);
...
this.servers.push(server);
return server;
}
- RedisIoAdapter.beforeApplicationShutdown iterates over the servers and closes each one.
- We just wait some amount of time to give our clean up tasks a chance to complete... (hack... Ideally the WebSocketServers could provide some way to signal that they are done cleaning up..)
async beforeApplicationShutdown() {
this.logger.log("Shutting down...");
for(const server of this.servers){
server.close();
}
// Wait 5 seconds to give the handleDisconnect handlers time to finish their work...
await new Promise((resolve) => setTimeout(resolve, 5000));
}
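For anyone skimming, the registration part of the workaround (something that does receive lifecycle events keeps a list of things that do not, and forwards the shutdown call to each) boils down to roughly the following. This is a framework-free sketch, not my actual code: `ShutdownParticipant`, `ShutdownRegistry` and `notifyAll` are hypothetical names, with `ShutdownRegistry` playing the role of RedisModule.

```typescript
// Anything that wants to be told about shutdown (hypothetical interface).
interface ShutdownParticipant {
  beforeApplicationShutdown(): Promise<void>;
}

// Plays the role of RedisModule: it receives the lifecycle event and
// forwards it to every adapter that registered itself.
class ShutdownRegistry {
  private readonly participants: ShutdownParticipant[] = [];

  register(participant: ShutdownParticipant): void {
    this.participants.push(participant);
  }

  // Awaits each participant in turn. A failing participant is recorded
  // and does not prevent the remaining ones from being notified.
  async notifyAll(): Promise<unknown[]> {
    const errors: unknown[] = [];
    for (const participant of this.participants) {
      try {
        await participant.beforeApplicationShutdown();
      } catch (err) {
        errors.push(err);
      }
    }
    return errors;
  }
}
```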
The above approach works for my case, but is pretty hacky and mainly just a proof of concept. The biggest issue is there is no nice way to 'know' if all the WebSocketGateways that have clean up to do have finished cleaning up. In my proof of concept I just wait 5 seconds, but in the real solution it would be nice if the async clean up logic was just awaited.
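One way the fixed 5 second sleep could be replaced, sketched without any NestJS or socket.io imports: each disconnect handler registers its clean up promise with a shared tracker, and the shutdown hook awaits whatever is still in flight, with a timeout as an upper bound. `CleanupTracker`, `timeoutAfter` and `ChatGatewaySketch` are hypothetical names I made up for this sketch, not existing APIs, and the timer in `removeFromLobby` stands in for the real Redis/MongoDB writes.

```typescript
// Hypothetical helper: a timer promise that can be cancelled so it does
// not keep the process alive after the clean up has finished.
function timeoutAfter(ms: number): { promise: Promise<void>; cancel: () => void } {
  let cancel: () => void = () => {};
  const promise = new Promise<void>((resolve) => {
    const timer = setTimeout(resolve, ms);
    cancel = () => clearTimeout(timer);
  });
  return { promise, cancel };
}

// Hypothetical helper: remembers in-flight clean up work.
class CleanupTracker {
  private readonly pending = new Set<Promise<void>>();
  readonly errors: unknown[] = [];

  // Register a clean up task. Failures are recorded, never rethrown,
  // so one failed clean up cannot abort the shutdown.
  track(task: Promise<void>): void {
    const settled: Promise<void> = task.then(
      () => { this.pending.delete(settled); },
      (err: unknown) => { this.errors.push(err); this.pending.delete(settled); },
    );
    this.pending.add(settled);
  }

  // Resolves to true once nothing is pending, or false if timeoutMs
  // elapses first. Tasks tracked while draining are waited for as well.
  async drain(timeoutMs: number): Promise<boolean> {
    const deadline = Date.now() + timeoutMs;
    while (this.pending.size > 0) {
      const remaining = deadline - Date.now();
      if (remaining <= 0) return false;
      const timeout = timeoutAfter(remaining);
      await Promise.race([Promise.all(Array.from(this.pending)), timeout.promise]);
      timeout.cancel();
    }
    return true;
  }
}

// Stand-in for a gateway; in a real app these would be the gateway's
// handleDisconnect and beforeApplicationShutdown methods.
class ChatGatewaySketch {
  private readonly tracker: CleanupTracker;
  private readonly lobby: Set<string>;

  constructor(tracker: CleanupTracker, lobby: Set<string>) {
    this.tracker = tracker;
    this.lobby = lobby;
  }

  handleDisconnect(userId: string): void {
    this.tracker.track(this.removeFromLobby(userId));
  }

  async beforeApplicationShutdown(): Promise<void> {
    // Still bounded by a timeout, but returns as soon as the work is done.
    await this.tracker.drain(5000);
  }

  private async removeFromLobby(userId: string): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, 20)); // simulated DB write
    this.lobby.delete(userId);
  }
}
```

This still relies on the servers being closed (so that the disconnect handlers fire) before `drain` is called, and it only helps if the hook runs before the Redis and MongoDB connections are torn down, so it improves on the sleep without removing the need for framework support.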
@timbo-tj Can you send a link to the repo or a repo with the issue replicated, so that I can take a look and replicate it and get a feel for what you are getting at?
Hey Joseph! Sorry I don't know how I missed your message. I will look into putting together a repo at some point, if I can find the time. Thanks!!