node-restify
node-restify copied to clipboard
When using handleUncaughtExceptions, if an error is thrown within a stream callback in the `uncaughtException` handler, restify enters an infinite loop.
- [x] Used appropriate template for the issue type
- [x] Searched both open and closed issues for duplicates of this issue
- [x] Title adequately and concisely reflects the feature or the bug
Bug Report
Restify Version
Tested on 7.7.0
, almost definitely present in all versions due to underlying Node issue.
Node.js Version
v10.15.1
, likely present in all versions of Node.
Expected behaviour
No infinite loop
Actual behaviour
Infinite loop
Repro case
const restify = require('restify');
const http = require('http');
const server = restify.createServer({ handleUncaughtExceptions: true });
server.on('uncaughtException', () => {
console.log('catching exception!');
http.get('http://google.com', () => { throw new Error('foobar'); });
});
server.get('/', () => { throw new Error('fail'); });
server.listen(2000, () => console.log('Server now up on port 2000'));
Cause
This is caused by this issue in node.
Are you willing and able to fix this?
Yes.
There is a workaround available for this issue, which would be to change on
to once
here. If the restify team is on board with this change, I'm happy to submit a PR.
One of my concerns with this is that it makes this problematic behavior silent. Would it be useful to know whether the error handler actually throws or emit an error than is not handled?
If so, would setting up a custom error event handler that:
- wraps the error event handler and somehow tries to report an error (publish a metric somewhere) first, assuming that error reporting code would be simpler/more robust than the original error handling code
- if called again, turns to a no-op
be a better trade-off between silencing all errors that occur in the error handler and consuming resources without bounds?
Another question that I have is: is it possible for an error to be emitted or thrown in the domain error handler before resources associated with the request/response are released? If so, would those resources ever be released if the error handler is set using once
?
Would it be useful to know whether the error handler actually throws or emit an error than is not handled?
Yes, I definitely think that's ideal.
If so, would setting up a custom error event handler that:
- wraps the error event handler and somehow tries to report an error (publish a metric somewhere) first, assuming that error reporting code would be simpler/more robust than the original error handling code
- if called again, turns to a no-op
be a better trade-off between silencing all errors that occur in the error handler and consuming resources without bounds?
That does sound like a reasonable proposal to me. Any idea what you'd think such an API would look like?
Another question that I have is: is it possible for an error to be emitted or thrown in the domain error handler before resources associated with the request/response are released? If so, would those resources ever be released if the error handler is set using
once
?
That I'm not sure of. I could do more testing on Node to get a lower-level understanding. Regardless, I think that it would be ideal to solve this such that we can ensure that Restify always releases resources, if possible.
@cprussin I'm thinking that prescribing/enforcing any behavior in restify is going to be tricky, because it seems difficult to determine what the best way is for restify consumers to handle those edge cases.
Is it possible for you to implement those ideas in a new domain within that restify domain error handler?
We can definitely do that @misterdjules. Being that that's the case, Restify using once
instead of on
would cause restify to not enter an infinite loop, but would still allow consumers to write code that catch this case. So, should we change restify or leave it as is?
@cprussin I'm not sure anymore that once
is actually desirable in all cases.
For instance, it seems like a legitimate use case for restify's uncaughtException handler is to log errors that are emitted on the req and res objects, or any other emitter created in the context of req/res' domain. Using once
would allow only one error event to be logged.
Thoughts on that?
@misterdjules yeah, that's a good point.
Depending on what happens with the upstream node issue, we could also consider changing restify to create a new domain and call _onHandlerError
within that domain. That is, change https://github.com/restify/node-restify/blob/4900d6bdd51fa4e1769678562de69929c38a0c4b/lib/server.js#L829-L831 to:
handlerDomain.on('error', function onError(err) {
var errorDomain = domain.create();
errorDomain.on('error', function onError(err) {
// Do something with err, probably including telling app
// authors how they can handle this more gracefully...
});
errorDomain.run(function run() {
self._onHandlerError(err, req, res, true);
});
});
That way, we avoid an infinite loop in restify, but apps can still do error detection within the uncaught exception handler (by creating yet another new domain...).
@cprussin Very interesting!
Would the errorDomain
's error handler include potentially making sure that the request/response's resources are closed if err is a thrown error (as opposed to an emitted error)?
I presume it could at the very least log an error too, even though I wonder if that would drown into a sea of logs and turn out to be not that useful.
@misterdjules I would think it likely should do both of those, but OTOH is there an argument to be had that in this scenario, we should kill the process?
@cprussin I don't know about killing the process, since the goal of using domains in the first place is to not exit on uncaught exceptions. I'm starting to think that all this should be left to the consumer of Restify because they have the most context on how they want their application to behave.
That is valid for https://github.com/nodejs/node/pull/26211 too, and so I'm starting to think that is not a desirable change too.
Regardless, we could definitely improve Restify's and domain's documentation by documenting those edge cases and available workarounds.
Thoughts?
@cprussin I don't know about killing the process, since the goal of using domains in the first place is to not exit on uncaught exceptions. I'm starting to think that all this should be left to the consumer of Restify because they have the most context on how they want their application to behave.
That is valid for nodejs/node#26211 too, and so I'm starting to think that is not a desirable change too.
@misterdjules I agree that the ultimate insights into that an exception occurred should be up to the app, but I think that having a mode where the tool can enter an infinite loop unexpectedly (or at least, due to a condition that isn't obvious) is a problem--entering an unexpected infinite loop is a worse condition than crashing the process anyways. However, the more we chat about this, the more I think that the problem isn't Restify's to solve. Because of that, I'm actually really in favor of your change to Node, but I still think we should document those edge cases when using the restify flag so that consuming apps know how to account for these cases.
So my viewpoint on this is:
- Node shouldn't have a condition where it enters an infinite loop due to some nonobvious condition. Your PR will fix that, but will cause cases where
handleUncaughtExceptions
isn't sufficient to keep the server from crashing. - I believe the purpose of the
handleUncaughtExceptions
flag is to indicate that uncaught exceptions on the request should be handled. I don't think it's up to restify to handle all possible exceptions. There will be edge cases to what conditions restify can handle exceptions under (for instance, what if an exception is thrown in Restify before we even enter the domain?). Point 1 will add such an edge case. - To combat 2, we should add documentation in Restify for whatever edge cases exist in uncaught exception handling, and how to write a robust and stable error handler while using that flag.