ember-concurrency RFC: improved task linkage semantics

This RFC is not "complete" in that it doesn't yet offer a concrete proposal, but I wanted to put it out there and start collecting feedback

Summary

In ember-concurrency, when a parent task is cancelled, it cancels any child tasks it is currently yielding. This is desirable behavior we should continue to support, but the issue is that this linkage occurs at yield-time rather than at perform-time which causes a variety of subtle issues. The purpose of this RFC is to discuss alternatives that would solve a few ugly corner cases and bring e-c more in line with classic paradigms for process tree management.

Motivation

Here is the canonical example of a parent task performing a child:

parentTask: task(function * () {
  yield this.childTask.perform();
}),

The above example results in the parentTask being "linked" to the child in such a way that if the parent task is cancelled (explicitly or implicitly via .restartable()), the cancellation will cascade into the linked child task and cancel the child task.

What if there's a gap between when you perform a task and when you yield it? Consider the following

parentTask: task(function * () {
  // perform `childTask`, which starts running immediately
  let taskInstance = this.childTask.perform();

  // do some other async before you actually "await" the task instance...
  yield timeout(5000);

  // "await" the task instance (linkage occurs here)
  yield taskInstance;
}),

There is a "gotcha" with the above example: parentTask doesn't get linked to childTask until the yield taskInstance, so if parentTask is cancelled (or errors) before yielding the child task instance, the child task will continue to execute. The result of this is an inconsistent cleanup of the task/process tree depending on when you cancel, which leads to very subtle race-conditiony bugs.

The likelihood of encountering these cases increases if you start using (the e-c variants of) Promise.all/race/etc; often you need to do some setup of multiple child tasks before you're ready to await all of them, which expands the window of unlinked-ness.

If ember-concurrency made you decide, at perform-time, whether the child task should be linked to the parent task, a lot of these issues would go away. EC actually has an API for this, but people don't really know about it unless they get console.log()s alerting them about the dreaded self-cancel loop footgun.

parentTask: task(function * () {
  // perform childTask and link it to parentTask
  let taskInstance = this.childTask.linked().perform();

  // do some other async before you actually "await" the task instance...
  yield timeout(5000);

  // "await" the task instance
  yield taskInstance;
}),

There's also an API for marking the perform as unlinked() to prevent linkage when the task instance is yielded:

parentTask: task(function * () {
  yield this.childTask.unlinked().perform();
}),

If parentTask is cancelled in the above example, childTask will continue to run.

While it's nice that these escape hatches exist, ember-concurrency's defaults are non-ideal and can lead to surprisingly behavior. The purpose of this RFC is to propose changing the defaults. Specifically:

The only thing yielding a task instance should do is await it to run to completion. It should not cause linkage.
By default, performs should be unlinked (or perhaps we get rid of perform as a word and always make you choose performLinked or performUnlinked). In other words, for something to be linked, you must see reference to the word link in the code.

This would definitely be a breaking change, but I consider this to be one of the main issues that make me hesitant to propose that ember-concurrency be a part of Ember.js Core.

Detailed design

How to access parent task instance?

Any proposed API for making task linkage explicit has to solve the problem of "how do I get a reference to the parent task instance that I am linking to?"

Current `.linked()` approach

e.g. this.childTask.linked().perform()

One ugly aspect of this API is that it relies on asking ember-concurrency what the current running task instance is which is essentially a global variable that ember-concurrency maintains as it executes task instances. I'm not sure how big a deal this is, but it's definitely... weird. I'm not exactly sure how/whether this API would troll people, but here's an example of improper use:

parentTask: task(function * () {
  later(() => {
    this.childTask.linked().perform();
    // ERROR: "You can only call .linked() from within a task."
  }, 500);
}),

This would need to be rewritten as:

parentTask: task(function * () {
  let linkedHandle = this.childTask.linked()
  later(() => {
    linkedHandle.perform();
  }, 500);
}),

(Aside: it should be noted that if the parent task instance is cancelled before the timer elapses, linkedHandle.perform() will essentially no-op: it'll create/return a task instance that is immediately cancelled.)

Explicitly fetching/passing parent task instance

We could make it so that .linked() accepts a TaskInstance rather than looking up the global current running task instance. We could also make a .performLinked method that accepts the parent task instance as the first arg, e.g.:

parentTask: task(function * () {
  let currentInstance = /* TODO some API for fetching the running task instance */;
  this.childTask.linked(currentInstance).perform();
  // or
  this.childTask.performLinked(currentInstance);
}),

I don't have strong opinions as to whether we provided performLinked vs .linked(...).perform(), but there's still the issue of how to fetch the current running instance.

Note that encapsulated tasks don't have this issue since the value of this within the *perform() is the task instance:

parentTask: task({
  *perform() {
    this.owner.childTask.performLinked(this);
  }
}),

But for classic tasks, the only feasible API for fetching the parent task instance would be to yield a "special command" to e-c:

import { task, getCurrentTaskInstance } from 'ember-concurrency';

parentTask: task(function * () {
  let currentTaskInstance = yield getCurrentTaskInstance;
  this.childTask.performLinked(currentTaskInstance);
}),

This approach might surprise people, as generally speaking if you're using e-c, yield always means "await the following async thing". But we can also use yield to issue special commands to the e-c task runner, and getCurrentTaskInstance would be a special command that immediately returns a reference to the current running task instance.

CancellationTokens

Regardless of which approach we settle on for accessing the parent task instance to link to, we can think of the argument that you pass to .linked() as kind of CancellationToken, which is a common abstraction for managing task/process trees as well as a concept in various proposals to add cancellable promises to JavaScript.

Here is an example of how cancel tokens might look/work with async functions:

async function doStuff(arg1, arg2, cancelToken) {
  await uninterruptibleWork();
  cancelToken.cancelIfRequested();
  await interruptibleWork(cancelToken);
  await uninterruptibleWork();
  cancelToken.cancelIfRequested();
}

Basically, the caller is responsible for constructing/providing a cancelToken to doStuff. The caller can request cancellation at any time on the cancelToken, but the async function ultimately decides when it's a good time to cancel via called .cancelIfRequested() at discrete times. Whereas ember-concurrency tasks are assumed to be cancellable at any point, async functions are uncancellable by default, and you must "sprinkle in" cancellability where you need it using cancel tokens.

The story for "linkage" in ember-concurrency should probably factor in CancellationTokens. Perhaps it makes sense that .linked() / .performLinked() should also work with cancel tokens, such that if you do .linked(cancelToken).perform(), you have the same guarantees as if you'd linked to a parent task: the moment cancellation is requested on the token, the ember-concurrency task will immediately cancel.

Additionally, it should be possible to convert a task instance into a CancellationToken for interop with async functions, e.g.:

import { task, getCurrentTaskInstance } from 'ember-concurrency';

parentTask: task(function * () {
  let currentTaskInstance = yield getCurrentTaskInstance;
  this.doStuff('a', 'b', currentTaskInstance.asCancelToken());
}),
async doStuff(arg1, arg2, cancelToken) {
  // ...
}

Error handling?

How should we handle the following case:

parentTask: task(function * () {
  let me = yield getCurrentTaskInstance;

  // perform linked tasks but don't await them
  let a = this.childTask.performLinked(me);
  let b = this.childTask.performLinked(me);
  let c = this.childTask.performLinked(me);

  yield timeout(10000)
}),

What happens if one of the child task instances errors? Should parentTask immediately cancel? Should it only cancel if it is yielding a childTask?

I'd made a decision long ago that if you yield a child task that gets cancelled, the cancel bubbles up in a way that is un-catch-able (internally this is achieved by returning from the generator function rather than throwing). I made the change because at the time if you wanted to catch errors thrown from perform tasks, you had to check whether the error was a "TaskCancellation", e.g.

parentTask: task(function * () {
  try {
    yield this.childTask.perform();
  } catch(e) {
    if (isTaskCancellation(e)) {
      return; // disregard
    } else {
      // an actual error occurred; do stuff.
    }
  }
}),

With the change, you don't have to write the isTaskCancellation guards all over the place, but you also lose any way to treat an expected child task cancellation as an error. We should revisit this decision as we consider how to handle performLinked-but-not-awaited getting unexpectedly cancelled.

Perhaps we can draw some inspiration from Erlang:

The default behaviour when a process receives an exit signal with an exit reason other than normal, is to terminate and in turn emit exit signals with the same exit reason to its linked processes. An exit signal with reason normal is ignored.

A process can be set to trap exit signals by calling: process_flag(trap_exit, true)

When a process is trapping exits, it does not terminate when an exit signal is received. Instead, the signal is transformed into a message {'EXIT',FromPid,Reason}, which is put into the mailbox of the process, just like a regular message.

An exception to the above is if the exit reason is kill, that is if exit(Pid,kill) has been called. This unconditionally terminates the process, regardless of if it is trapping exit signals.

This makes sense for Erlang, but ember-concurrency task instances don't (currently) have any concept of mailbox / event queue or a receive loop, so this wouldn't really fall neatly into ember-concurrency's model. Perhaps we need some additional utilities to catch/report on the status of child tasks, but I'm not sure what?

How we teach this

TODO

Drawbacks

TODO

Alternatives

TODO

Unresolved questions

TODO

Sep 28 '18 16:09 machty

Implicit linking is the major issue so I'm for any solution that does away with that. I'd go so far as to say that even this.childTask.performLinked(yield getCurrentTaskInstance) is implicit linking in that what happens with errors/cancelation is not explicit

Previously we've had discussions around context and it's just dawned on me that the task instance could be the context. Your example triggered that though:

this.doStuff('a', 'b', currentTaskInstance.asCancelToken());

Given that it's the parent process that 'cares' about what the children are doing, does it make sense to reverse the semantics of the linking so the it's up to the parent to check for errors and pass on cancelation signals? This would remove a whole issue (implicitly linked children with no yield/ability to catch errors). A test for 'explicitness' could be "can these operations be done by outside a task definition?", just like you can with Promises

Exisiting behaviour in an explicit way might look like this:

import { task, getCurrentTaskInstance } from 'ember-concurrency';

parentTask: task(function * () {
  let childTaskInstance = this.childTask.perform();
  let currentTaskInstance = yield getCurrentTaskInstance;

  currentTaskInstance.propagateCancel(childTaskInstance);
  yield childTaskInstance;
})

Sep 29 '18 06:09 martletandco

Regarding getting a reference to the current task: could this not be curried the same way that an event is curried through actions?

So that if you do not pass any arguments in to .perform(), the first arg of the generator function would be the task instance. If you do something like .perform(someArg) then the generator’s second argument would be the task.

Perhaps then the most useful function to provide would be one that grabs the instance from the arguments.

Is that a possibility? If so, there is obviously already precedent inside Ember for curried arguments.

Assuming that, I personally like the explicit linking by passing the instance in to something like task.link(instance).perform().

It would possibly then allow you to link to things that have the required api, but may not necessarily be a task, or might not be the task at the call site.

Sep 29 '18 06:09 happycollision

I may have been misusing the terminology a bit there, but I hope the idea was clear.

Sep 29 '18 06:09 happycollision

This would remove a whole issue (implicitly linked children with no yield/ability to catch errors).

@martletandco can you elaborate? I didn't understand.

Also, I believe this was a typo

currentTaskInstance.propagateCancel(childTaskInstance);

but I don't understand your mental model enough to know which one should be currentTaskInstance.

Do we agree that linkage at least means: if a parent task is cancelled, its child tasks all get cancelled (or receive a cancel signal that some future API will allow them to response to)?

If so then we just need to decide what the behavior/API in the child->parent direction is.

One backwards-compatible approach we could start exploring would be at add "yield modifier" methods to TaskInstance that instruct e-c what to do when the child task finishes executing; the default behavior is "return completed values unwrapped, throw exceptions, and quietly propagate cancellation", but we could do something like:

let childTaskInstance = this.childTask.perform();

yield childTaskInstance.catchCancellation();
// or
yield childTaskInstance.asResult();
// in my mind `asResult` disables the error-throwing-propagation
// by wrapping ALL values returned from the task instance in a Result POJO,
// e.g. when the task succeeds it returns:
// { error: false, success: true, value: ExceptionObj }
// when it fails:
// { error: true, reason: ExceptionObj, success: false }

Note that "yield modifiers" simply return special command objects that e-c's task runner can make sense of; they don't modify/mutate the task instance's config or anything.

Sep 29 '18 10:09 machty

then the generator’s second argument would be the task.

@happycollision you mean "would be the task instance" right?

I don't think we'd be able to simply enable this for all tasks as it would break anyone's app who's using splatting (e.g. ...args).

That said, maybe there's an approach where we use task modifiers to denote that the task should receive extra data as args:

foo: task(function * (currentTaskInstance, ...args) {
  yield this.otherTask.linked(currentTaskInstance);
}).gimmeTaskInstance(),

I'm starting to think that there's not a lot of value in requiring the user to obtain the current task instance simply for the purpose of passing it to .linked() on child tasks; as much as the current approach of deriving the current task instance automatically is a bit magical, I'm having a hard time thinking of when that behavior would/could actually cause problems.

So I'm thinking .linked() we continue to default to the current running task instance (without requiring you to obtain it in some way), but we should allow passing in a CancellationToken, or I guess another task instance? Would something like the following ever be useful:

let a = this.childTask.unlinked().perform();
let b = this.childTask.linked(a).perform();
let c = this.childTask.linked(b).perform();

// if `a` ever gets cancelled, b and c will get cancelled

Sep 29 '18 10:09 machty

I read up a bit more on Erlang semantics:

Erlang process linkage is bidirectional; regardless of which process is parent/child, if they linked and one of them crashes/terminates, the other is sent an exit signal which will automatically cause termination unless you explicitly trap the event.
Erlang also supports "monitoring", which is unidirectional and doesn't by default cause a cascading exit.

It's good to keep in mind the distinctions between e-c's concept of linked-ness vs Erlang's, and maybe we'll discover that we should be using a different name than "linked".

Also I'm realizing more and more that e-c tasks are (and should remain) a different abstraction than processes. Tasks are (ideally) short-lived self-contained units of work that have a clear beginning, middle, and end, whereas processes are intended to be long lived and are built around the concept of a message queue and a loop. I'd like to explore adding processes to e-c to see if there's any value; in my apps I can think of a few cases of tasks that have infinite while(true) loops in them that have all the baggage of weird linkage/cancellation semantics, and I think @martletandco has similar use cases.

Sep 29 '18 12:09 machty

I'm going to try to avoid the word 'linking' and focus on these two (three?) parts:

Capturing child task errors 1b. (?) Capturing child task cancellation
Signalling child tasks to cancel

Mental model is that parents are responsible for capturing output/errors and signalling cancellation (so more like Erlang's monitoring)

Capturing child task errors (including cancellation) seems to be more straightforward. We can handle the error ourselves using a try-catch around a yield, or the proposed yield modifiers (yield childTaskInstance.asResult();) to bubble the error to the parent task's runner

What I'm proposing is that a non-modified child-task-yield does not automatically mean the child task will cancel when the parent does. If we have another way to signal a child to cancel when the parent does then we swap a subtle bug for a hopefully less subtle one. That's what I was trying to get at with propagateCancel (and why the method is on the parent task)

The issue I see this separation solving is the awkward 'where does this error bubble to?' problem that comes from having 'depending on child task result' coupled with 'cancel all tasks I depend on when I am cancelled'

I'm all for setting a tight focus on what tasks are for/can do and introducing another concept for processes

Sep 29 '18 14:09 martletandco

@martletandco So in other words you're proposing that

yield this.childTask.perform();

should ideally behave like:

yield this.childTask.unlinked().perform();

(I'm using the word "unlinked" before simply because the above .unlinked() usage is actual usable ember-concurrency API today.)

So in other words, ember-concurency's default should be that tasks are performed in an unlinked manner, and that yielding an unlinked task does not auto-link them. I believe this is the same as how you put it: "a non-modified child-task-yield does not automatically mean the child task will cancel when the parent does"

I think I'm in agreement with you that this is the ideal default behavior (but we should force ourselves to think through the example or demonstrate actual code from our apps that we would need to fix to account for such a change), but we also need to discuss how/whether to roll out this change when e-c has so many active users (including other addons). If we made this change, suddenly e-c would stop cleaning up / cancelling child tasks in a way that would almost certainly cause breakage.

I discussed with @lukemelia and @ryanto the other night that maybe there needs to be a big breaking 1.0.0 release, and that there will be pain associated with it but it'll hopefully be short-lived. But how we roll out this change should be as important to work out as what the change/defaults should be.

Sep 29 '18 14:09 machty

happycollision you mean "would be the task instance" right?

Yes.

I like the gimmeTaskInstance() modifier idea. Here's why: It allows you to pass the instance along to something else without question. Every yield is an opportunity for the next line to be cancelled before running.

So this in the "yield to get the instance" scenario...

parentTask: task(function * () {
  let currentTaskInstance = yield getCurrentTaskInstance;
  this.doStuff('a', 'b', currentTaskInstance.asCancelToken());
  // more stuff below...
}),

There is zero guarantee that this.doStuff will be called. Whereas in the "gimme" scenario:

foo: task(function * (currentTaskInstance, ...args) {
  this.doStuff('a', 'b', currentTaskInstance.asCancelToken());
  // more stuff below
}).gimmeTaskInstance(),

As long as the task is not called as a dropped task to begin with, this.doStuff is definitely going to run. (Right?)

I know for my part, if there are things that are done inside my task that I always want to run, I do try and write the task in such a way as to get that all done before the first yield.

However, I do like the idea of teaching that yielding can be used as a way of getting certain objects or asking e-c to do things on your behalf. So, as with anything else, this is a tradeoff. I suppose if you'd rather support the yield getCurrentInstance version of things, when folks really need the instance before the first yield, they can write an encapsulated task.

Sep 30 '18 15:09 happycollision

@happycollision Your explanation makes sense, but makes me think we should consider improved/simpler mental models to avoid requiring that you think through every possible cancellation corner case. In other words, it's unfortunate that seasoned e-c users must have a fear of yield as a possible cancellation point, as it clutters thinking, hampers refactoring, and leads to difficult-to-test regressions whenever you make modifications to a task that at one point was delicately designed to avoid certain sequences of yields that might feasibly lead to an invalid state.

I don't know what that better mental model is but if we're thinking in terms of 1.0.0 release goals, I'd like to address any common cases where e-c requires developers to have eternal vigilance to avoid race conditions.

It's pretty unlikely that you would encounter the situation you described in actual production code, but I believe something like this would reproduce it:

someTask: task(function * () {
  this.restartableChildTask.perform();
  this.restartableChildTask.perform();
  this.restartableChildTask.perform();
}),
restartableChildTask: task(function * () {
  let currentTaskInstance = yield getCurrentTaskInstance;
  yield doImportantWork(currentTaskInstance.asCancellationToken());
}).restartable(),

I believe this would result in all restartableChildTask instances, except for the last, to fail to doImportantWork; internally, the getCurrentTaskInstance would immediately produce a value, but e-c wouldn't continue executing until the actions run loop queue, which would happen after the other this.restartableChildTask.perform()s.

This is a rare and contrived situation that probably points to something else that "off" about your code (e.g. calling restartable tasks multiple times in a row and depending of the e-c behavior of synchronously executing the first chunk of a task). But aside from this very unusual corner case, I believe there's a more common need for some ability to mark regions of a task as uncancellable/uninterruptible, which would solve this use case AND other use cases of actual async (and not the short-term run loop "async" that completely resolves within one javascript event loop tick). Maybe we add a task modifier called .uninterruptible() which automatically prepends (pre-curries) the task instance to the arguments so that it can be used to signal when the task instance becomes interruptible again.

someTask: task(function * () {
  this.restartableChildTask.perform();
  this.restartableChildTask.perform();
  this.restartableChildTask.perform();
}),
restartableChildTask: task(function * (currentTaskInstance) {
  yield doImportantWork(currentTaskInstance.asCancellationToken());
}).uninterruptible(),

It might be a weird unexpected side effect that .uninterruptible() would start providing the task instance as the first arg, but maybe not? I dunno.

That all said...

For the purposes of this RFC, I'm still thinking it's overkill that we require passing the parent task instance to childTask.linked().perform() when we can automatically derive it with 99.999% confidence. The only time it seems useful would be for async function interop.

Sep 30 '18 16:09 machty

Regarding performLinked and performUnlinked:

I love the current simplicity of perform. However, when I was learning e-c, I must admit that your choice of using cancelAll as the method to call on a Task, helped me to understand the system very clearly. So as much as I would hate to see the api get a little less "pretty", I do think that it would help people understand what is going on, and to opt-in to the exact behavior they want.

I also would assume the vast majority of "perform"s would be "Linked", though. Unless your thought is that perform exists outside of the generator function, but only performLinked and performUnlinked exist inside the generator function.

Sep 30 '18 16:09 happycollision

For the purposes of this RFC

Right right right. I wasn't thinking about this as 1.0.0 properly. If you want to add either .gimmeTaskInstance() or yield getCurrentTaskInstance, those are easily minor releases and don't have to be considered now. Simply deciding that linked() needs no argument is what is needed for 1.0.0 now. Got it.

Sep 30 '18 16:09 happycollision

Communication is going to be the biggest challenge. Would moving to performUnlinked and performLinked mean that deprecation warning could be added? Does linked/unlink have to come before perform? Can you link a task instance after it's been started/queued?

Not sure if this is off topic, I'd really like to see what an e-c process model might be. I keep wondering if Tasks are being asked to do too much and that is part of the issue. I know that I use them in services a bunch where it's a much more long lived deal and feed values in and out would make life easier. Perhaps an extended model of encapsulated tasks which a channel or stream concept. Another idea might be to have some more hooks into the task runner so you can have control over task grouping and the like

Oct 01 '18 02:10 martletandco

@martletandco It's off-topic enough to save discussion for another RFC (or the Discord channel) but here are some sketches of e-c process APIs

I'm not sure what the use cases are for linkage after perform but we can always add it in later if we need it. I guess we can start to deprecate unqualified .perform()s over time in favor of .performLinked() or .performUnlinked().

Oct 01 '18 02:10 machty

Usecase for cancelation token/signal in child task: Parent task performs a child task that starts a router transition, if the parent task is cancelled that transition should be aborted. (This is almost the component-self-transition-abort issue but intended)

This brings up a point around how to capture an error or cancelation in the current task (self-catch), if a child task or async function might get a cancelation token from a parent, why doesn't the parent get one from the task runner?

parentTask: task(function * () {
  let token = yield getCanelToken;
  const transition = this.router.transitionTo('place');
  token.onCancel(transition.abort);
  yield transition;
}),

Oct 01 '18 02:10 martletandco

RFC: improved task linkage semantics

Summary

Motivation

Detailed design

How to access parent task instance?

Current .linked() approach

Explicitly fetching/passing parent task instance

CancellationTokens

Error handling?

How we teach this

Drawbacks

Alternatives

Unresolved questions

That all said...

Current `.linked()` approach