stack-switching Support for long-running computations

A given computation may undergo multiple suspensions/resumptions in its lifetime. For example, in the case of use async/await to access I/O, a given computation is likely to be suspended and resumed multiple times during its lifetime.

On the other hand, keeping track of the identity of the suspending computation may be important, for example, in scheduling strategies that attempt to be fair across multiple tasks.

This appears to require a form of identity for computation that is not limited to a single suspension/resumption.

Jan 12 '22 22:01 fgmccabe

This can still be efficiently modeled via a single (observably) immutable suspension paired with a separate identity that'd likely be needed either way, like storing a stack_index inside the task or coroutine state, so suspensions can be resumed correctly. Even if the suspension was mutable, you'd still need the pointer to resume it.

A mutable stack that can be entered multiple times would be easier to implement efficiently and optimize for, though (and most languages use such a model, actually).

Jan 14 '22 01:01 dead-claudia

The issue with that approach is recovering the identity of the task from a suspended computation; in particular a continuation has no other information accessible from outside than the continuation itself.

Jan 14 '22 18:01 fgmccabe

It'd be tracked separately through an external global or similar, similar to how most interpreters do it normally even today. Not sure what's special about it - the only difference is you're storing a stack reference instead of the actual stack data + stack pointer + instruction pointer + whatever system registers need stored.

Jan 15 '22 01:01 dead-claudia

I understand that. But you have to pick the identity up when you suspend. At best that means storing it in a global when you resume the continuation. But there may be a mixture of things going on; so you have to be certain that the identity you get from your global is the right one. As I said, fragile.

Jan 15 '22 01:01 fgmccabe

It is fragile in the sense it's easy to screw up, but it's also pretty standard faire for compiler writers for languages with coroutines that compile to native (either statically or via JIT). Additionally, JIT compiler writers in general have to implement most of the coordinating logic anyways (even though they don't need to persist any stack data) so they can properly jump between interpreter and JIT, so this really isn't much for them.

Jan 15 '22 10:01 dead-claudia

Another go: In many languages/applications, the natural high-level conceptualization revolves around tasks rather than continuations.

However, in order to efficiently implement continuations, specifically delimited continuations, the stack is often split at a prompt bracket and collected (reified) when the suspension causes the continuation to be constructed. This stack is effectively a resource that is constructed under the covers.

If a task involves multiple suspensions, then the same stack resource is re-used (assuming that the prompt bracket remains the same) each time the computation is suspended. In effect, at each suspension, a snapshot of the stack resource is created as the basis of the continuation entity.

So, the picture is one of resources -> snapshots -> resources. One is strongly tempted to skip the middle man.

In addition, some desirable properties (such as opt-in safety) necessarily require inspecting the stack resource since there is not enough information in the continuation qua continuation to ensure them.

Jan 19 '22 17:01 fgmccabe

@fgmccabe If it helps, this is my mental model of how tasks work and my understanding of how potentially-awaiting tasks are implemented.

There are two main ways of implementing tasks (and cooperative coroutines in general): state machines and instruction pointer tracking.

State machines are common for stackless tasks as it's relatively simple to implement (most engines initially implemented JS generators this way) when the stack size is statically known in advance. This only requires segment of memory and a table branch to implement - no special bytecode support is needed.
Instruction pointer tracking is mostly used for high-performance scenarios (it's one load + jump instead of two) and for stackful coroutines (like Lua's) where the stack size isn't known in advance. This requires either a special mechanism to delimit continuations or the ability to control the underlying instruction pointer.

In a sense, the latter is merely the former broken down into its smallest parts. One could view each instruction position as a state, and the instruction pointer as the offset corresponding to that state.

The two are similar in theory, but also significantly different in implementation. Here's a quick overview of each of them at a high level (excluding obvious optimizations and such).

State machine

Examples: Kotlin's suspend functions (due to the JVM's lack of native coroutine/task support), V8's original generator implementation

Perform create:
- Create a new task state machine with space for the task's stack data, and set its state to the intial state for the task.
- Call the task's body with the task state machine as its state context - this returns either an Await(value) or a Return(value)
- If the result is an Await, subscribe to its value with a function that when called, performs a resume with the task state machine and value
- If the result is a Return, enqueue a successful fulfillment as appropriate
Perform resume with task state machine and value:
- Call the task's body with the task state machine as its state context - this returns either an Await(value) or a Return(value)
- If the result is an Await, subscribe to its value with a function that when called, performs a resume with the task state machine and value
- If the result is a Return, enqueue a successful fulfillment as appropriate
Perform yield with value
- Update outer task state machine's data with whatever needs to persist to the next state
- Set the outer task state machine's state to the state after this yield
- Return from the originating task body

Instruction pointer

Examples: WebKit's generator implementation (search for "High-Throughput Generators"), Lua's coroutines

Perform create:
- Create a new stack
- Save the current stack frame
- Switch to the newly created stack and set the first entry to the old stack pointer
- Call the task as a function - this returns either an Await(continuation, value) or a Return(value)
  - The stack pointer, instruction pointer, and stack itself at the time of suspension are saved to the continuation in Await results
- Restore the old stack pointer + stack frame
- If the result is an Await, subscribe to its value with a function that when called, performs a resume with the continuation and value
- If the result is a Return, enqueue a successful fulfillment as appropriate
Perform resume with previous continuation and value:
- Save the current stack frame
- Switch to the previous continuation's stack and set the first entry to the old stack pointer
- Call the task as a function - this returns either an Await(next continuation, value) or a Return(value)
- Restore the old stack pointer + stack frame
- If the result is an Await, subscribe to its value with a function that when called, performs a resume with the next continuation and value
- If the result is a Return, enqueue a successful fulfillment as appropriate
Perform yield with value
- Create a new continuation with the current stack pointer, instruction pointer, and stack data
- Return from the originating task (non-locally for stackful tasks, locally from stackless ones)

Jan 19 '22 20:01 dead-claudia

This makes me think the instruction pointer part could be addressed with the following:

A continuation_ref type that encapsulates a stack offset, stack pointer, and value. This would be nullable.
A continuation.start typeidx :: (...args, func_ref) -> (...results, continuation_ref) instruction that enters func_ref and returns a nullable continuation. If null, it's a return value, and if non-null, it yielded.
A continuation.resume typeidx :: (...args, continuation_ref) -> (...results, continuation_ref) instruction that resumes from a non-null continuation ref and returns a result.
A continuation.yield typeidx :: (...results) -> (...args) instruction that yields and resumes as desired.
- ...results corresponds to ...results from coroutine.start and coroutine.resume
- ...args corresponds to ...args from coroutine.resume

Thoughts?

Jan 19 '22 21:01 dead-claudia