nest
nest copied to clipboard
Disgraceful shutdown, request before signal interrupted with ECONNREFUSED
Is there an existing issue for this?
- [X] I have searched the existing issues
Current behavior
When the application is shutdown (e.g. with SIGTERM shutdown hook), requests that started before shutdown fail with ECONNREFUSED.
Minimum reproduction code
https://github.com/soryy708/nestjs-disgraceful-shutdown-mcve
Steps to reproduce
- Install dependencies by running
yarn install - Run the tests by running
yarn test
Expected behavior
Requests that started before shutdown should succeed, only new requests after shutdown should fail.
Package
- [ ] I don't know. Or some 3rd-party package
- [X]
@nestjs/common - [X]
@nestjs/core - [ ]
@nestjs/microservices - [X]
@nestjs/platform-express - [ ]
@nestjs/platform-fastify - [ ]
@nestjs/platform-socket.io - [ ]
@nestjs/platform-ws - [X]
@nestjs/testing - [ ]
@nestjs/websockets - [ ] Other (see below)
Other package
No response
NestJS version
9.3.12
Packages versions
{
"@nestjs/axios": "^2.0.0",
"@nestjs/common": "^9.3.12",
"@nestjs/core": "^9.3.12",
"@nestjs/platform-express": "^9.3.12",
"@nestjs/terminus": "^9.2.2",
"@nestjs/testing": "^9.3.12",
}
Node.js version
16.19.0
In which operating systems have you tested?
- [X] macOS
- [ ] Windows
- [ ] Linux
Other
No response
I believe the error lies somewhere here or in one these methods. Somewhere in one of these, the request is being stopped


So, I could be wrong, but what I have found is that the callDestroyHook method is iteratively calling callModuleDestroyHook and that is what is removing the modules via the shift method on the list of providers. So, what my intuition is telling me, is that the provider is being removed while the controller is utilizing it and thus the request is just being stopped. Again, I could be wrong, but I think this is a potential root cause.
Hello, what's the update on this issue??
I think that the problem lies on the express adapter itself, on the close method.
public close() {
this.closeOpenConnections();
if (!this.httpServer) {
return undefined;
}
return new Promise(resolve => this.httpServer.close(resolve));
}
As you can see, this close method closes all the openConnections before calling this.httpServer.close.
As you can see in the express documentation (https://expressjs.com/en/advanced/healthcheck-graceful-shutdown.html), the close method on the httpServer is supposed to wait for all connections to be terminated. So technically I think that if we remove this.closeOpenConnections(), it would just work as expected.
https://github.com/nestjs/nest/blob/master/packages/platform-express/adapters/express-adapter.ts#L159-L166
cc @tolgap 👀
@coderhammer this method call won't have any impact if forceCloseConnections is false (which is the default)
Honestly I'm not sure, I just tried on a new project and it closes open connections with the default. Here is the current implementation of the closeOpenConnections method:
private closeOpenConnections() {
for (const socket of this.openConnections) {
socket.destroy();
this.openConnections.delete(socket);
}
}
https://github.com/nestjs/nest/blob/f676276979e651f6258b5bc42666e9aa7fe3dbf4/packages/platform-express/adapters/express-adapter.ts#LL429C1-L434C4
A quick fix for me for now is to implement my own OnModuleDestroy function to wait for all the connections to be closed. This will wait before calling nest's implementation on the http adapter.
import { Injectable, Logger, OnModuleDestroy } from '@nestjs/common';
import { HttpAdapterHost } from '@nestjs/core';
@Injectable()
export class AppService implements OnModuleDestroy {
constructor(private readonly httpAdapter: HttpAdapterHost) {}
async onModuleDestroy() {
Logger.debug('Module destroyed called!');
const httpServer = this.httpAdapter.httpAdapter.getHttpServer();
await new Promise((resolve) => httpServer.close(resolve));
Logger.debug('Waited for all connections to be closed!');
}
}
EDIT: The behaviour is happening when activating app.enableShutdownHooks() I don't know if this is relevant
Honestly I'm not sure, I just tried on a new project and it closes open connections with the default. Here is the current implementation of the closeOpenConnections method
this.openConnections will always be empty unless trackOpenConnections (https://github.dev/nestjs/nest/blob/da27730cd4ef37fdf7f29aafee53ed8ac2a8b4b8/packages/platform-express/adapters/express-adapter.ts#L421) is called and it's executed only if forceCloseConnections is true https://github.dev/nestjs/nest/blob/da27730cd4ef37fdf7f29aafee53ed8ac2a8b4b8/packages/platform-express/adapters/express-adapter.ts#L234
I'm so sorry, it works exactly as expected when trying again on a fresh project. I was trying something with a special configuration. Have a nice day! This is not an issue for me anymore.
@coderhammer what did you change? Is there a workaround?
Honestly, everything works fine on my side when testing manually. I tried your reproduction repo but I think that it's more a test configuration that is failing :/
@coderhammer If you think the issue is with test configuration, where do you think the issue is?
I thought it could be that Test.createTestingModule from @nestjs/testing is bugged, so I added a test suite that doesn't use it and instead uses NestFactory from @nestjs/core, but the bug still happens.
I don't think it's feasible to reproduce this manually (without an automated test), because the timeframe in which this bug occurs is very small, but is enough to be experienced by users in a production system under load.
Also I don't think that app.enableShutdownHooks is related, because in the MCVE tests I linked if I comment out enableShutdownHooks it still fails with the same error. (clearly that means my MCVE wasn't minimal. Whoops!)
After investigating a little bit on your test repo, it seems to me that the problem comes from supertest itself. I'll try to make a reproduction later this week
@coderhammer Looks like Supertest may be related, but is not the only issue. I added to the MCVE a suite that does the same tests but without Supertest.
The same test case (shutting down while a request is in progress) fails on ECONNREFUSED.
However so does making a request after the server shut down (about which I don't care if it's 503 or ECONNREFUSED, the server finished shutting down).
@soryy708 could you spin up a pure express application (non-NestJS-driven) just to see what's the original behavior for your test?
With the simplest possible Express application (without even Terminus) it acts the same as with Nest not using Supertest.
const express = require('express');
function createServer() {
const app = express();
app.get('/health', async (_req, res) => {
res.status(200).send();
})
return new Promise((resolve, reject) => {
const server = app.listen(0, error => {
if (error) {
reject(error);
return;
}
resolve([server.address().port, () => server.close()]);
});
})
}
describe("API main (no Supertest)", () => {
describe("Lifecycle", () => {
let port;
let close;
beforeEach(async () => {
[port, close] = await createServer();
});
afterEach(async () => {
await close();
});
const probeReadiness = async () => {
const request = new Request(`http://localhost:${port}/health`, { method: 'GET' });
const response = await fetch(request);
return { statusCode: response.status };
}
describe("When bootstrapped", () => {
describe("When running", () => {
it("Should respond successfully", async () => {
const response = await probeReadiness();
expect(response.statusCode).toBe(200);
});
});
describe("When shutting down", () => {
describe("When a request is already in progress", () => {
it("Should respond successfully", async () => {
const responsePromise = probeReadiness();
const closePromise = close();
const [response] = await Promise.all([
responsePromise,
closePromise,
]);
expect(response.statusCode).toBe(200);
});
});
describe("When a new request is made", () => {
it("Should respond with 503 Service Unavailable", async () => {
await close();
const response = await probeReadiness();
expect(response.statusCode).toBe(503);
});
});
});
});
});
});
Result:
FAIL ./index.test.js
Lifecycle
When bootstrapped
When running
√ Should respond successfully (49 ms)
When shutting down
When a request is already in progress
× Should respond successfully (8 ms)
When a new request is made
× Should respond with 503 Service Unavailable (3 ms)
Both on ECONNREFUSED.
I expect that if I use Terminus, it will postpone Express'es shutdown long enough to flush the in-progress requests.
If this behavior can be reproduced with the pure express application, then I don't think there's anything we can/should do given we don't really want to alter the default driver's behavior. Perhaps reporting this issue in the terminus package would make more sense then?
Are we sure that it's an issue with the Terminus package?
I'm opening an issue in https://github.com/nestjs/terminus.
Given that this is the default behavior of HTTP drivers we rely on (Nest is built upon), I wouldn't even call this an "issue" but rather a "feature request". Whether it should be implemented & exposed from the terminus package, or an entirely new package that extends the default driver and adds some extra functionality on top, is a different question.
Whatever you want to call it.
It's the default behavior of Express without anything, because Express is not a framework like Nest is.
Express docs tell you that if you want graceful shutdown, you need to add another thing to Express (e.g. @godaddy/terminus, which @nestjs/terminus shares a name with but doesn't depend on).
https://expressjs.com/en/advanced/healthcheck-graceful-shutdown.html
The idea is to observe for SIGINT signals, and when it happens, to have the health check respond with failures, and delay Express'es server.close() until after all pending requests were handled.
It's the default behavior of Express without anything, because Express is not a framework like Nest is.
This doesn't imply we should be changing the driver's default behavior, especially by default.
Express docs tell you that if you want graceful shutdown, you need to add another thing to Express (e.g. @godaddy/terminus, which @nestjs/terminus shares a name with but doesn't depend on).
So I'd say exposing this feature from the @nestjs/terminus package would make sense.
@kamilmysliwiec I have done some investigation on this.
It seems Fastify has an "option" for handling things like this. They introduce an option called return503OnClose: https://github.com/fastify/fastify/blob/main/lib/route.js#L437-L449
Then before Fastify performs the close logic, they return a 503 on any open request in their route handler.
I can see how this issue pops up for regular health checks. But this can also be useful for SSE! I think we should consider keeping this logic in @nestjs/platform-express, but gate it behind an option. Just like Fastify does. We can then introduce this option on the application level. But if you're of the opinion that we should handle this in @nestjs/terminus, we can keep it that way!
I can see how this issue pops up for regular health checks. But this can also be useful for SSE! I think we should consider keeping this logic in @nestjs/platform-express, but gate it behind an option. Just like Fastify does. We can then introduce this option on the application level. But if you're of the opinion that we should handle this in @nestjs/terminus, we can keep it that way!
Either way works for me. Since it can be useful for SSE too (and we expose abstractions for SSE from the core) then having it in the platform-specific packages would make sense too!
Hey @kamilmysliwiec Do you have an ETA for this feature request?
Have you tried using this package https://www.npmjs.com/package/nestjs-graceful-shutdown?