shelf icon indicating copy to clipboard operation
shelf copied to clipboard

Suggestion: always run serve_requests in `runZoneGuarded`

Open annagrin opened this issue 4 years ago • 22 comments

serveRequests does not always wrap listening to http requests in a runZoneGuarded wrap:

https://github.com/dart-lang/shelf/blob/6e38575985a8094f4ac172228a6711196d9ae91e/lib/shelf_io.dart#L73 https://github.com/dart-lang/shelf/blob/6e38575985a8094f4ac172228a6711196d9ae91e/lib/src/util.dart#L16

This leads to hard to debug async exceptions later if something up its call chain adds its own runZoneGuarded wrapper - in that case exceptions stop being caught and ignored inside serveRequests and fall to the next root zone exception handler, which unexpectedly changes to logic of the calling code. Note that adding such top level handler will change behavior of all serveRequests calls, including the ones coming from imported libraries, which makes fixing those new issues particularly complicated.

Can we always run code inside serveRequests in a runZoneGuarded, and provide it with an error handler instead for the cases where error behavior needs to be controlled by the user?

annagrin avatar Sep 28 '21 20:09 annagrin

I think I buy that. Thoughts @natebosch @jakemac53 @lrhn ?

kevmoo avatar Sep 28 '21 20:09 kevmoo

@natebosch looked at pulling out the error handling to do something different during the null safety push and we never got anywhere.

Might be worth revisiting, though...

kevmoo avatar Sep 28 '21 20:09 kevmoo

Do we know what the rationale for the existing behavior was? It does seem problematic to me, but it also seems like somebody went to a fair bit of effort to make it behave that way?

jakemac53 avatar Sep 28 '21 20:09 jakemac53

@nex3 had good reasons, I'm sure – perhaps she can shed some light?

kevmoo avatar Sep 28 '21 20:09 kevmoo

Yeah the current behavior is magic and confusing - the hidden interactions with error zones does cause problems.

https://github.com/dart-lang/shelf/issues/2#issuecomment-742081585

My preference is to remove anything related to error zones in this package, but we'd need to be careful rolling it out.

natebosch avatar Sep 28 '21 20:09 natebosch

The OTHER idea.

Chatting with a community member about this.

Deprecate shelf_io.dart and create another, more opinionated package to handle the "wire up the server" bits.

kevmoo avatar Sep 28 '21 20:09 kevmoo

I am good with anything that does not change behavior if a top level handler is added, we would need to update a bunch of existing libraries in that case though. My current case - ddr has a top level handler that has been added to collect crash metrics, but it runs a bunch of http servers via libraries such as dds, dwds, devtools, build daemon etc.

annagrin avatar Sep 28 '21 20:09 annagrin

If I recall correctly, the idea behind the current error handling was to be a backstop in the case where the user hadn't thought about error handling at all. Crashing the server on an error is about the worst possible behavior, so the current behavior says "if the server would otherwise crash, log an error instead, otherwise leave it up to the user". The expectation was that most servers would do more thorough error-handling, either by adding middleware that captured errors or by having some app-wide error handler (that is to say, running the server in a pre-existing error zone).

I think the root issue here may be that it wasn't clear enough that users should set up their own explicit error-handling. You could solve this in a non-breaking way by adding a message to the default error log saying "set up better error handling", and possibly also providing a built-in error-handling middleware.

nex3 avatar Sep 28 '21 20:09 nex3

Unfortunately it introduced the opposite behavior from the desired one in this case - if all serveRequests calls in all the libraries are used unwrapped and then someone adds a top level wrapper, server starts effectively crashing, because the top level zone was just added as a way to catch anything, print in nicely to the screen and exit. So it looks like the only safe way to use this API in a library is by wrapping it in runZoneGuarded with error handling, because there is always a chance for that top wrapper addition in the calling code later.

So it looks to me that the current API is incomplete, because it does not require error handling.

Maybe we can introduce another one that does not change behavior depending on calling code?

We can make it very similar to serveRequests but add an error handler that is passed to the zone that serveRequests unconditionally runs in. We can set the default error handler to do what it currently does - print 'Async error' and exit.

Can even eventually mark the old one as deprecated if needed.

annagrin avatar Sep 28 '21 22:09 annagrin

@nex3

You could solve this in a non-breaking way by adding a message to the default error log saying "set up better error handling", and possibly also providing a built-in error-handling middleware.

This would be great, but in this case we need to inform the user where the error came from. The errors are asynchronous and look like "SocketException: OSError: broken pipe" with an unrelated stack at best.

Maybe we can remember the stack at the call to serveRequests and add it to the exception to propagate to the lower runZoneGuarded?

There is also a question how to handle those errors coming from other libraries, where running an API that starts a server in runZoneGuarded is too late to properly recover from broken socket errors.

annagrin avatar Sep 28 '21 22:09 annagrin

This would be great, but in this case we need to inform the user where the error came from. The errors are asynchronous and look like "SocketException: OSError: broken pipe" with an unrelated stack at best.

Are you using stack chains in your global error handler? That can go a long way towards helping to clarify the real source of an error.

The trick here is that you want to balance three things:

  1. Having graceful error-handling behavior by default if the user doesn't configure anything special.
  2. Allowing the user to configure their own totally custom error handling.
  3. Minimizing nonlocal behavior where a global error handler affects the way a server's errors are handled.

nex3 avatar Sep 28 '21 23:09 nex3

Are you using stack chains in your global error handler?

Thanks, I wasn't, added them to wrap serveRequests call.

Agree with all 3 items!

Btw, what would be the best default error handling in the case of serveRequests? Does always ignoring them work for both the server and the client? We have hard to reproduce socket exceptions happening intermittently, and inducing them manually in the socket code and always ignoring them in the runZonedGuarded wrapper sometimes leaves the server unresponsive.

annagrin avatar Sep 29 '21 17:09 annagrin

I'd say ignoring errors is almost always the wrong behavior. I'd say either always logging, or always throwing a nicely-formatted error is probably the best option.

nex3 avatar Sep 29 '21 20:09 nex3

Makes sense, I meant logging and ignoring in this case. I wonder if there is any way to filter errors that are likely to render the server unusable in which case we probably should throw.

annagrin avatar Sep 29 '21 21:09 annagrin

Linked all cases that can potentially contribute to ddr crashes.

annagrin avatar Sep 29 '21 22:09 annagrin

Fix I used in dwds and webdev:

/// Handles [requests] using [handler].
/// 
/// Captures all sync and async stack error traces and passes
/// them to the [onError] handler.
void serveHttpRequests(Stream<HttpRequest> requests, Handler handler,
    void Function(Object, StackTrace) onError) {
  return Chain.capture(() {
    serveRequests(requests, handler);
  }, onError: onError);
}

annagrin avatar Sep 29 '21 22:09 annagrin

Should we just add a boolean parameter here that forces it it handle and log all async errors instead of only doing so if there is no existing custom error handling zone? forceHandleAndLogUncaughtErrors or maybe something a bit more concise?

jakemac53 avatar Sep 30 '21 14:09 jakemac53

@jakemac53 how about the boolean flag forceHandleAndLogUncaughtErrors and a custom error handler, onError(e,s)?

annagrin avatar Oct 04 '21 23:10 annagrin

Sketch out a PR and we can review?

On Mon, Oct 4, 2021 at 4:30 PM Anna Gringauze @.***> wrote:

@jakemac53 https://github.com/jakemac53 how about the boolean flag forceHandleAndLogUncaughtErrors and a custom error handler, onError(e,s)?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/dart-lang/shelf/issues/202#issuecomment-933933123, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAEFCVAKWPHQ7E2XD4D373UFI2IVANCNFSM5E6HYHSA .

kevmoo avatar Oct 05 '21 02:10 kevmoo

@jakemac53 how about the boolean flag forceHandleAndLogUncaughtErrors and a custom error handler, onError(e,s)?

Or possibly we could just add the onError handler parameter, and if that is passed then we assume you want to handle all errors?

jakemac53 avatar Oct 05 '21 15:10 jakemac53

even better! I will submit a PR

annagrin avatar Oct 05 '21 17:10 annagrin

Sent a sketch. Let me know what you think!

annagrin avatar Oct 05 '21 18:10 annagrin