sdk icon indicating copy to clipboard operation
sdk copied to clipboard

Remove "error zones" from `Zone`.

Open lrhn opened this issue 3 years ago • 8 comments

Proposed change

Zones define the concept of “error zones” and prevents some errors from flowing from a future in one error zone to a future in another.

We will remove the restriction that errors cannot flow into futures from other error zones. The error zone will solely be used as the default target for uncaught errors. If an error is handled, everything is fine, even if the handler is in another zone.

This amounts to removing the extra code that discards or ignores listeners from other error zones.

We also deprecate Zone.inSameErrorZone and Zone.errorZone at the same time, because there really is no good reason for using them left. Error zones are not special. We don’t have to deprecate the members, but I recommend doing so. Zones are generally not intended for introspection, you should run in the zone that you are given, and create a new zone if you want control. (It would be consistent with that to even deprecate and remove the Zone.root getter, so the only zone you have direct access to is Zone.current. That's likely to be more breaking, though.)

Background

The Dart Future class interacts specially with custom Zone.handleUncaughtError implementations.

If a user-created Zone defines a custom uncaught-error handler (which includes, for example, the zone introduced by runZoneGuarded), it is considered an “error zone”. Handlers are inherited, and all nested zones inheriting the same uncaught-error handler are considered to be in the "same error zone”. So, an error zone contains all the zones that share the same uncaught-error handler.

The Zone API exposes these concepts as a Zone get errorZone getter and through a bool inSameErrorZone(Zone) method.

If a future is created inside one error zone and completed with an error, then that future will (try to) avoid propagating the error to futures created in any zone with another error zone. This only applies to the propagation of results from a future to its listeners (the callback functions added using .then/.catchError/.whenComplete).

If there are no listeners on the future from inside the same error zone, the error is reported as uncaught to its own error zone. If there are listeners from both inside and outside the error zone, only those inside gets to respond to the error, the remaining listeners are ignored.

The consequence is that when you listen on a future created in a different error zone, you risks creating a future which never completes. That’s almost invariably a bad thing. Awaiting a future which doesn’t complete will stop an async method in the middle of execution. It will never progress, never exit, and even surrounding finally blocks will never run. There have been numerous bug reports from users not understanding what is going on when that happens. (We have seen numerous examples ever since the introduction of zones and error zones.)

The Future implementation has had to add extra code to check for and ignore listeners from other error zones. In most cases, this code is pure overhead.

Inconsistencies

Most people don’t know this behavior even exists. They only discover it when something goes wrong.

People who do know that the feature exists, usually do not know exactly how it behaves. For example, it only affects futures completing with errors, and has no impact on streams or other asynchronous operations at all.

Also, the Future implementation contains some legacy “features” where the error actually does propagate to or through futures in other zones.

For example, when completing one future with another, say when the .then callback returns a future or you call Completer.complete with a future, then both futures will complete with the same result. Internally, as an accident of implementation, it creates a special link between the futures, and moves all the listeners to all be on one of the futures (the one being completed with, which should provide the result for the other one). This allows errors to skip past futures in other error zones. Example:

import "dart:async";
void main() {
  // c1.future belongs to outer error zone
  var c1 = Completer();
  runZonedGuarded(() {
    // c2.future belongs to inner error zone
    var c2 = Completer();
    c1.complete(c2.future);
    c2.completeError("Inner Error");
    c1.future.catchError((e, s) {
      // Skips c1 completely, gets result directly from c2.
      print("Inner future: $e");
    });
  }, (e, s) {
    print("Zone caught error: $e");
  });
}

This behavior wasn't intended, but was introduced at the time of implementation because the behavior of “errors and error zones” was never formally specified, and nobody was looking closely at it.

(I “fixed” that when recently refactoring Future, making the inner future’s zone count too, and had to reintroduce the “broken” behavior because the fix broke tests relying on ignoring the intermediate zone. In general, when code depends on the specific error zone behavior, it's usually depending on errors getting through where they shouldn't, not depending on futures not completing.)

Benefits of change

  • Simpler code in the implementation of Future.
  • Users won’t see futures not completing for reasons that are not clear.

Affected code

Changed behavior

Code affected by this change will see a future complete which previously did not, and it completes with an error. If such code exists, it will most likely now fail with an error.

That’s probably a good thing. I expect it to mainly affect code which is already erroneous and therefore will make some test fail that previously succeeded for the wrong reasons. It’s exceedingly rare for code to depend on a future not completing because of an upstream error happening in a different error zone (since, again, most people don’t even know that’s a thing.)

Deprecations

If we deprecate inSameErrorZone and errorZone, the known uses of inSameErrorZone outside of the SDK are:

  • package:shelf: Should just not use it. (Used to guard a call to runZonedGuarded to only happen if in the root zone’s error zone. That seems like an unsafe behavior, since any error zone can call its parents uncaught error handler anyway, so having a different error zone does not guarantee that the error won’t reach the root zone.)
  • package:test_api tests use inSameErrorZone to check that some callbacks happen in the same error zone. They should just check that the appropriate handler is called on errors.

Also, Angular has a single use of .errorZone.

Mitigation

There may be a few cases of code that relies on a throw getting caught by the error-zone, and then not awaiting the non-completing future at all. If using the unawaited_futures lint, those futures will be wrapped in unawaited. The correct migration will likely be to change unawaited(uncompletedFuture) to uncompletedFuture.ignore().

lrhn avatar Jul 15 '22 14:07 lrhn

Most people don’t know this behavior even exists. They only discover it when something goes wrong.

FWIW I do know this behaviour exists and I still keep discovering it anew when something goes wrong, usually after hours of debugging.

Hixie avatar Jul 19 '22 19:07 Hixie

@Hixie @vsmenon @grouma are we okay with this breaking change?

itsjustkevin avatar Jul 21 '22 00:07 itsjustkevin

ACX does not make use of isSameErrorZone and errorZone. However, it does make use of a custom handleUncaughtError function that exposes them to users through a public API with a broadcast stream, i.e. ngZone.onUncaughtError. I'm not familiar with this code but it is used 25+ times. For this specific issue it's probably easiest to validate the change with a global presubmit.

grouma avatar Jul 21 '22 01:07 grouma

I don't object to this change in principle, but I am confident it will cause breaking on the Flutter side (and probably in Google too), and once we find out what that breakage is we should have a conversation about how to approach it, whether it needs a migration guide, etc.

Hixie avatar Jul 21 '22 02:07 Hixie

The biggest obstacle I've noticed so far is runZonedGuarded.

It's used in several places to catch all errors from a function, even those from a Future returned by that function, but not actually awaited inside the zone (in which case the future returned by runZonedGuarded won't seem to complete outside the zone).

I can simulate that, and probably should, but that's still repeating the "non-completing futures" problem, because it's not possible to return a future with a valid value post null safety. At least it's limited to runZonedGuarded and can be documented.

lrhn avatar Aug 17 '22 13:08 lrhn

Tentative LGTM - this sounds like a good change - assuming the breakage is manageable.

Have we tried flutter or google tests with the new behavior?

vsmenon avatar Aug 17 '22 15:08 vsmenon

Have CL: https://dart-review.googlesource.com/c/sdk/+/255400 Patchset 2 makes runZonedGuarded catch async errors and return a non-completing future in that case.

lrhn avatar Aug 17 '22 15:08 lrhn

@vsmenon does @lrhn recent comment answer your question here?

itsjustkevin avatar Sep 13 '22 16:09 itsjustkevin

@lrhn - are you still pushing on this? Did you assess the breakage here?

vsmenon avatar Nov 03 '22 17:11 vsmenon

Initial tests showed more breakage than I expected. I've put this on the back-burner for now.

lrhn avatar Dec 15 '22 11:12 lrhn

Thanks for calling this out @lrhn

itsjustkevin avatar Dec 15 '22 13:12 itsjustkevin

@lrhn this change is still showing a breakage. Do we still desire this breaking change?

itsjustkevin avatar Feb 03 '23 14:02 itsjustkevin

There has been no movement on this breaking change request.

Closing the requests, @lrhn feel free to reopen if this is still desired.

itsjustkevin avatar Sep 18 '23 14:09 itsjustkevin