microvium
microvium copied to clipboard
Async/await design proposal
Async/await in Microvium
I'm not "promising" that async/await will be implemented in Microvium, but it's on my radar. This GitHub ticket is a consolidation of my personal design notes on the feature, in case anyone wants to review it and offer suggestions or improvements, or just for those interested in the process. Feel free to add comments to this ticket with any questions, corrections, clarifications or suggestions.
In this first message, I'll cover the motivation and requirements, and then in the next message, I'll cover the proposed design.
Why async/await?
- I believe that Microivum will be useful in IoT solutions, where a device interacts with a server. Async/await is a very ergonomic way of expressing control flow logic involving a non-local target.
- Similarly I've expressed that callback-based-async is a good reason to use Microvium in an otherwise-C environment, for expressing sequences of non-blocking behavior that might otherwise be expressed in C as complicated state machines. Obviously, async-await is even better than callbacks!
Objectives
- Be able to write
async
functions thatawait
the result of otherasync
functions. - Provide a way of creating something that an
async
function canawait
that isn't anotherasync
function. E.g. a thenable or promise. Ideally, this should be the syntaxnew Promise(...)
since this is the typical way to do this in JS. - As always, the behavior of user code should be the same in Microvium as in V8.
- As always, size is a major priority. First RAM size, and secondly the engine's compile size.
Gotchas to watch out for
What should the following print (in particular, in what order)?
useThenable('a');
usePromise('b');
useAsync('c');
async function useThenable(name) {
console.log(`Awaiting thenable ${name}`)
await { then: resolve => resolve() };
console.log(`Finished awaiting thenable ${name}`)
}
async function usePromise(name) {
console.log(`Awaiting promise ${name}`)
await new Promise(resolve => resolve());
console.log(`Finished awaiting promise ${name}`)
}
async function useAsync(name) {
console.log(`Awaiting async func ${name}`)
await (async () => {})();
console.log(`Finished awaiting async func ${name}`)
}
The answer is:
Awaiting thenable a
Awaiting promise b
Awaiting async func c
Finished awaiting promise b
Finished awaiting async func c
Finished awaiting thenable a
The key gotcha here is that the thenable
completes last. Why? Because a thenable is not a real Promise
, so the ECMA-262 spec says that it must be wrapped in a promise. I think the extra wrapping layer causes it to be posted to the promise job queue twice, so the callback is run last.
This also means that thenables are necessarily less efficient than promises, because thenables must be wrapped in promises anyway (so you land up having both the thenable and the promise in RAM) and they must have this behavior where they land up on the promise job queue twice instead of once.
Here's another gotcha:
usePromise('a');
useAsync('b');
useThenable('c');
function useThenable(name) {
console.log(`Waiting for thenable ${name}`)
waitFor({ then: resolve => resolve(name) });
}
async function usePromise(name) {
console.log(`Waiting for promise ${name}`)
waitFor(new Promise(resolve => resolve(name)));
}
async function useAsync(name) {
console.log(`Waiting for async func ${name}`)
waitFor((async () => name)());
}
function waitFor(thing) {
thing.then(result => {
console.log(`Finished waiting for ${result}`)
})
}
This prints:
Waiting for promise a
Waiting for async func b
Waiting for thenable c
Finished waiting for c
Finished waiting for a
Finished waiting for b
The key thing here is that the thenable c
completes first. Why? Because Promises always evaluate their callback asynchronously in the promise job queue, while an arbitrary thenable does not.
As stated in the objectives, if this code runs in Microvium then it must produce the same output. We have the choice either to forbid/remove certain features or to implement them correctly.
Proposed design
Note: There is a babel plugin to transform async to promises. I considered using that because it's an off-the-shelf way of supporting async with very little development cost on my end, but when I ran it on some test code, it looks like the output is horrendously complicated and verbose, especially in the case of things like loops. I don't think I could ever stomach using async if it were that inefficient.
CPS
The core of this design is based on continuation-passing style (CPS), where an awaiting caller passes in a callback that is a continuation to execute when the async callee completes. This is fundamentally different from typical promises where the caller subscribes a callback after the callee returns. I believe that the continuation-passing style is much more memory efficient, but it will only work in a subset of scenarios (more on that later)
For example, if async function foo
calls await bar()
, where bar
is also an async function, then bar
must remember that foo
is awaiting it so that it knows where to return to.
I'm proposing the following memory structure for such a scenario:
In this design, an idle (awaiting) async function consumes 8 bytes of memory (3 slots + a 2-byte allocation header), plus 2 bytes for each variable in the async function.
-
scope
- a pointer to the outer variable environment of the async function, so it can access its lexical closure. This slot could also be overloaded to hold the result of the async operation since the result is only available when the scope is no longer needed. -
resumeAt
- a function pointer that points to the bytecode corresponding to the current resume point in the async function. Practically, this could be a lightweight trampoline function that jumps into the correct location in the main async code. TheresumeAt
slot could also be overloaded to point to code that invokes theawaiter
when the async function reaches its terminal state, so that we don't need to allocate a separate closure to put on the promise job queue. -
awaiter
is a function pointer that points to the callback/continuation of the async caller.
To do this, I propose a modification to the current 6-byte closure structure. Currently, a closure is a 6-byte tuple that pairs a function pointer with an environment pointer. In this proposed change, a closure is also its own environment, rather than holding a pointer to its environment. It still maintains the scope
pointer, but this will be a pointer to its parent environment. In other words, I'm proposing the unification of closures and environments, so that environments are closures and closures are environments. This has 2 advantages:
- At least one closure in an environment can be allocated with the environment itself, saving on the cost of an allocation. This reduces the cost of the closure+environment from 10 bytes to 6 bytes, which is significant.
- Since the closure is its own environment, a closure can use normal variable-accessing opcodes to mutate the closure itself. In particular, it can mutate the function pointer in the closure, as if it's a local variable. This is important for this async proposal -- it allows the async function code to mutate its
resumeAt
pointer according to its current state.
When a closure is called, the function in the second slot (in the diagram this is the resumeAt
slot) will be invoked, using the whole closure structure as the current closure scope. In the case where multiple closures need to share an environment, they can each have their parent scope
pointer point to the shared environment.
Awaiter
In this proposed async design, a call like await foo()
will pass a CPS callback, which I call the awaiter
. An awaiter is a callback function (which is likely to be a closure that represents the continuation of the caller operation) that the async callee will invoke when it terminates (resolves or rejects).
awaiter(isSuccess, result)
The awaiter
call conforms to the following contract:
- The first parameter is an
isSuccess
flag that istrue
if callee resolves andfalse
if the callee rejects. - The second parameter is the result if
isSuccess
is true or an error ifisSuccess
is false. - The
awaiter
must be called on the promise job queue, or it may be called synchronously if the promise job queue is empty. - The
awaiter
must not be called more than once.
In the case of one async
function foo
calling another async function bar
(like in the diagram earlier), these invariants can be trivially guaranteed. The callee bar
is under the control of the compiler and so it can invoke the awaiter
only on the job queue, and it will only invoke it once (because bar
only completes once).
Compatibility with promises
The above proposed CPS "core" will only work in the specific scenario where one async function is calling and awaiting another async function, since both the caller and callee are under the control of the compiler. I believe this to be the most common case, and something worth optimizing for. But we need to also support the case where we await things that are not other async functions, and similarly need to support the use of calling an async function but not awaiting it (or at least not immediately).
The compiler will recognize the syntactic form await bar()
, which is a function call expression with an await
on the result. Roughly speaking, this syntactic form will compile to something which passes the awaiter continuation from the caller to the callee (possibly by a new VM register). If the callee is itself an async function then it will "take" the continuation and store it in its awaiter
slot for use later.
This is the ideal/optimal case, but there are other cases to deal with:
-
x = bar()
(i.e. not awaited) wherebar
is an async function. -
await bar
wherebar
is a promise -
await bar
wherebar
is not a promise -
await bar()
wherebar
is not an async function.
In case #1
, bar
needs to return a promise, since it's an async function and now the return value is observable, and CPS can't be used directly. For that case, bar
will "see" that there is no awaiter
and so it will create a Promise
object and synthesize its own awaiter which resolves this promise. Then bar
must return this Promise
. The Promise acts as a protocol adapter.
In case #2
, bar
needs to have its awaiter
added to the __subscriber
list of the promise. Internally, the subscriber list will follow the same contract guarantees as CPS, so it can be added directly. Or if the promise is already settled, the awaiter
can be enqueued or invoked directly.
Case #3
will be quite unusual I think, but it would be easy to support. The non-promise object can be wrapped in a promise that is in a resolved state before following the logic for #2
.
In case #3
, foo
will pass its awaiter
to bar
who will not use it. So foo
must then "notice" that the awaiter
callback is not consumed by the callee (because the callee is not CPS compatible), and so fall back to the logic of #2
or #3
on the result of the call to bar
.
I've elided a lot of the details here, because I haven't worked through them yet. I'm thinking that there will be probably be a new virtual register named awaiter
which is used as a side channel for the caller and callee to negotiate the CPS protocol. Ideally, this register may also be exposed to the C host, so that native async functions can participate in CPS if they support it, rather than relying on inefficient promises at the bottom.
What does resumeAt
point to?
I propose that async functions are basically compiled down to a single closure but with multiple entry points. The resumeAt
pointer will be a "function pointer" that is just a trampoline to invoke the primary async closure. There are no particular checks in Microvium that restrict the Jump
instructions from jumping only within their local function bytecode, so these resumeAt
functions could just be single-instruction functions which just jump to the right location in the primary async function.
How much RAM does this solution use?
Each idle async function uses 8 bytes, plus 2 bytes for each variable in the function.
At the boundary where async functions are interfacing with non-async function, promises will be needed. A promise is an object with the following properties:
-
__status
to indicate whether the promise is already settled or not, and what state it settled into -
__callback
that points to a CPS callback or a list of CPS callbacks. - When resolved, it will have a
__value
. I think the__value
and the__callback
properties could be the same overloaded property since they are used at different phases of the promise. So there's really only 2 properties.
An object with 2 properties is 14 bytes, so this brings the cost of async functions to 22 bytes at this boundary scenario.
In the situation where the user creates a promise using new Promise((resolve, reject) => {...})
, which is typical at the leaves of the async hierarchy, this will be a 14 byte promise object with 2 additional closures: resolve
and reject
. With the new closure design, the reject
closure could "inherit" from the resolve closure so that only two allocations are involved. In this case, the reject closure would be 6 bytes and the resolve closure could also be 6 bytes (if it reused the scope slot as the promise slot) -- a total of 12 bytes. In addition to the 14-byte promise object itself, this would be 26 bytes.
In summary:
- In the hopefully-common case of
await asyncFunc()
, each awaited function is a single 8-byte allocation. - In the less-common case where the result of the async function is assigned to a temporary variable or used in non-async code, the awaited function is 20 bytes.
- In the leaf scenario where a user constructs their own promise, it will cost 26 bytes per promise.
- For cases where the host implements the CPS protocol directly, the cost is essentially free in terms of managed memory. Or rather, the host will need to use its own memory, to hold a handle that points to the
awaiter
callback.
Version 8.0.0 brings async-await. I'll be doing some write-ups on how it works. It's roughly what I proposed above but slightly more efficient, only requiring 6 bytes of idle memory in the best case.