async/await/suspend/resume
This is a sub-task of #89.
I'm willing to try implementing async/await/suspend/resume for stage2, as I require them for a project I'm working on.
The issue is that I don't really know where to start.
It seems like AstGen supports them.
Sema doesn't (it calls failWithUseOfAsync, so here I know where I need to work), so I'll start with that.
The AIR only has async_call, async_call_alloc, suspend_begin, and suspend_end instructions. Looking at stage1, it seems like the await/suspend/resume instructions are missing. Should I try to just add the instructions and replace the calls to failWithUseOfAsync by looking at how stage1 implements them?
Furthermore, is async implemented in a similar way in stage2 as in stage1? (Basically, is stage1 a good representation of how stage2 implements and uses frames, async calls, suspends, resumes, etc.?)
Edit: I've been using the stage2-async branch, assuming that's where the async development is being done.
I've been following the WASI development and it seems to be going great! That being said, I am currently working on a new project and I am using some specific stage2 features. I am not using async yet, but I'd love to introduce it soon. Can you provide a very rough estimate of when this is planned to be merged into master? It is just for general planning (no pressure). Cheers!
Looking forward to this.
https://ziglang.org/news/0.11.0-postponed-again/
Can I use async primitives in 0.11.0 or will they be hard errors? I'd like to use them for interrupt handling.
@andrewrk Could you give a realistic estimate (not in releases, but in time) when asynchronous functions will return to the language?
0.12
@maxzhao 0.12 is roughly one more year away (or maybe half a year in the best case). I suppose it will land in master much earlier. That's why I asked for an estimate in time, not in releases.
I suppose then it would be useful to update the docs? The Async Functions section still states that async functions are being temporarily regressed and will be [restored before Zig 0.11.0 is tagged](https://github.com/ziglang/zig/issues/6025).
I was querying some implementation details of this with @andrewrk and he suggested moving the conversation here so it's recorded on the issue tracker. For reference, my questions are as follows:
Q1. Frames
What exactly does the @Frame of a function store? I've heard Andrew mention before that it contains all values spilled across suspend points, as well as a value indicating the index of the last suspend, with the idea being that the function effectively becomes a switch on this index to continue the function where we left off (accessing spilled locals also from the struct). However, this doesn't align with the idea that async can be used to avoid stack overflow in the case of recursion (see #1260 and #1639), for which we want all stack values to end up in the @Frame. As far as I'm aware, LLVM doesn't provide us with a way to move the stack allocations it creates during codegen, nor to even know how big they are. So, short of adjusting the stack pointer in a platform-dependent way (which still wouldn't explain how we determine the frame's size) how is this goal achieved?
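To make the "switch on the suspend index" idea concrete, here is a rough hand-written sketch of such a lowering; this is not compiler output, and all names here are hypothetical:

```zig
const std = @import("std");

// Hypothetical frame for a function with a single suspend point.
const Frame = struct {
    resume_index: usize = 0,
    // Locals that are live across the suspend point get spilled here:
    x: u32 = undefined,
    result: u32 = undefined,
};

// Each call runs the function from its last suspend point to the next one.
fn step(frame: *Frame) void {
    switch (frame.resume_index) {
        0 => {
            frame.x = 10; // code before the suspend point
            frame.resume_index = 1;
            return; // suspend: hand control back to the caller
        },
        1 => {
            frame.result = frame.x + 1; // code after the suspend point
            frame.resume_index = 2; // the function has fully returned
        },
        else => unreachable,
    }
}

test "manually resuming the hypothetical frame" {
    var frame = Frame{};
    step(&frame); // run up to the suspend point
    step(&frame); // resume and finish
    try std.testing.expect(frame.result == 11);
}
```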
Q2. Function Pointers
This question is about the language specification. When taking a pointer to a function, we may not yet know whether or not it is async (as this is determined after semantic analysis). More to the point, even if we did know this, it's not a part of the function's type, so we can't differentiate between an async and non-async function when calling a runtime-known function pointer. Since the code generated for calling async and non-async functions is necessarily different, how is this handled? As far as I can tell from looking through code from the old C++ compiler implementation, this just wasn't handled at all before, and you'd probably just get a crash if you attempted to call an async function through a pointer. The best solution I can think of to this is just that you can't take a reference to an async function (a compile error is emitted if you try) - we could emit these errors retroactively after determining a function is async - but perhaps there's a better solution I'm not seeing.
I agree with myself that @Frame stores all values spilled across suspend points. Safe recursion (#1006) was never accomplished in practice, and it's still something I want to pursue but I don't think it needs to be a requirement of landing async functions back into the compiler. I think I probably just didn't consider that fact about everything needing to be spilled. Or maybe I did and figured out some trick to make it work, like detecting that the function is recursive and forcing spillage. That might be a bad idea though. Still, there are other ways to tackle safe recursion, and that can be solved separately.
As for your second question, you hit the nail on the head. The compiler did not handle that case at all, and it caused all kinds of terrible problems. Async functions were certainly of experimental quality. I suggest to make it a compile error for now and we can go from there.
Thanks! Some minor follow-up questions:
- Aside from spilled locals, what needs to be stored in an async frame? Off the top of my head, I can think of return value, last suspend index, and return address - but for the latter we also need to know whether the function has been awaited yet, so I suppose make the return address null until the `await`? That would make `await` effectively be "is return value already populated - if yes, immediately return it - otherwise, set the return address and suspend". I guess rather than a return address, it'd be a pair of the caller's function pointer and frame pointer?
- What's the deal with `@Frame` of a non-async function? Since we're not (yet) dealing with safe recursion, that value is actually pretty meaningless. Should it just be 0 bits for now? More generally, what does the language spec say about using `async` for non-async functions? If `foo` isn't async, is it valid to do `async foo()` (and it effectively becomes a synchronous call, where I guess the frame just wraps the immediately-known return value)? Or is this a compile error? I never used async in stage1 so am not familiar with some of these semantics.
- What's the deal with `std.builtin.CallingConvention.Async`? Why does it exist? The doc comment on it suggests it does something very odd which I don't see a point to (makes a call `foo()` act like `async foo()`), so I feel like I'm missing something. If that is what it does, this would require standard function call syntax to consume a result location (the frame pointer), which would regress the test case discussed in e661900.
Aside from spilled locals, what needs to be stored in an async frame?
I do recommend following the codegen.cpp source here; there wasn't anything unused in it. Off the top of my head it's:
- switch prong index
- return value
- pointer to return value. this is needed for atomicrmw when await races with return in different threads
- error return trace
- spilled locals
I think there was a function near the bottom of analyze.cpp that spelled it out pretty linearly. resolve_async_frame or something along those lines.
The first part of the @Frame(foo) type has the same memory layout as anyframe->T which can be awaited without knowing exactly which function is being awaited, and the first part of that has the same memory layout as anyframe, which can be used with resume, without knowing the return type of the function being resumed.
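A minimal sketch of that layout-compatibility guarantee, in stage1-era syntax (foo here is just a hypothetical function with one suspend point):

```zig
const std = @import("std");

var result: i32 = 0;

fn foo() i32 {
    suspend {}
    return 42;
}

fn amain() void {
    var frame = async foo(); // frame has type @Frame(foo)
    const typed: anyframe->i32 = &frame; // awaitable without knowing which function it is
    const untyped: anyframe = &frame; // resumable without knowing the return type
    resume untyped; // runs foo past its suspend point, so it returns
    result = await typed; // picks up the already-stored return value
}

test "frame pointer coercions" {
    // Wrapped in `async` so the test itself does not become an async function.
    _ = async amain();
    try std.testing.expect(result == 42);
}
```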
I suppose make the return address null until the await?
You should look at how codegen.cpp lowered return and await, there is an atomicrmw that makes this stuff threadsafe.
What's the deal with `@Frame` of a non-async function?
The idea is that you can use async and await keywords on functions that do not have any suspend points, and they still work correctly even though the function has already fully returned at the async site.
Recommend to look at test/behavior/async_fn.zig.
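For example, a tiny sketch of that behavior in stage1-era syntax (add is a hypothetical function with no suspend points):

```zig
const std = @import("std");

var sum: i32 = 0;

fn add(a: i32, b: i32) i32 {
    return a + b; // no suspend points
}

fn amain() void {
    var frame = async add(1, 2); // add has already fully returned at this point
    sum = await frame; // await just unwraps the stored result
}

test "async call of a function with no suspend points" {
    // Wrapped in `async` so the test itself stays non-async.
    _ = async amain();
    try std.testing.expect(sum == 3);
}
```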
What's the deal with `std.builtin.CallingConvention.Async`?
This is needed to differentiate runtime-known function pointers to async functions vs non-async functions. Functions which have nonzero suspend points require the async calling convention. Functions with zero suspend points may be lowered with many different calling conventions, including the async calling convention.
Every function call could potentially be an async function call, and the compiler does not find this out until the function call graph is fully analyzed. Thus every function call must have access to a result location. e66190025ffab39527da601980b7e3211069b6f5 basically must be reverted in order to implement async functions.
This is needed to differentiate runtime-known function pointers to async functions vs non-async functions.
I'm a little confused on what you mean here, since to my initial Q2 you said that async function pointers weren't really a thing that was handled at all in stage1, and as you say, we don't know until after semantic analysis whether a function we took a reference to was async or not. I understand that the actual lowering of an async function must use a consistent calling convention (so that it can be resumed by e.g. an event loop after being type-erased), but why is this concept relevant to the frontend?
[...] Thus every function call must have access to a result location.
Hm, but a plain call foo() to an async function doesn't actually use its result location - the frame is (I presume) implicitly allocated into the caller's frame. So, surely we would only need a result location if explicitly providing the frame pointer via async foo() syntax? I think I'm missing something here. Perhaps it'll become clearer to me after reading the stage1 logic (I'm just about to pull it up).
It was possible to use async function pointers with @asyncCall, however, coercing an async function pointer to a non-async function pointer was incorrectly allowed.
the frame is (I presume) implicitly allocated into the caller's frame
Sorry, I was thinking of async calls, not regular calls. This was the main motivation for result location semantics:
```zig
const std = @import("std");

// foo stands in for whatever async function is being called.
fn foo() void {}

var static_frame: @Frame(foo) = undefined;

test {
    // Result location is a global:
    static_frame = async foo();
    // Result location is heap-allocated:
    const heap_frame = try std.testing.allocator.create(@Frame(foo));
    heap_frame.* = async foo();
    // Result location is the caller's stack frame:
    var stack_frame = async foo();
}
```
You are correct that plain calls to foo() will implicitly allocate in the caller's frame. So my commit message was correct in identifying #2765 as the collateral damage there, rather than async functions.
Async calling convention does not make foo() behave like async foo(); it makes the function have a runtime function pointer type compatible with functions that have suspend points. There is certainly an async calling convention; I mean, think about how async function calls are lowered very differently than normal function calls.
Ah, okay, that all makes sense. So, to define these semantics a little more formally, here's my understanding: any comptime-known function [pointer] with default (Unspecified) callconv can coerce to a function [pointer] with Async callconv - for async functions this is the only valid way to call the function through a pointer (and we can hopefully make having a runtime-known async function pointer without Async callconv a compile error), while for non-async functions, I assume it generates a trivial wrapper matching the Async callconv? i.e. which takes a trivial frame and just "unwraps" the call to the normal function, then puts the result into the frame.
All correct.
To be clear: async foo() on a function with unspecified calling convention and no suspend points does not need a wrapper; instead it returns @Frame(foo), a trivial wrapper around its return value which is then unwrapped with await.
Sure, that makes sense. But if we instead did this:
```zig
var runtime: *const fn () callconv(.Async) void = &someNonAsyncFunction;
const frame = @asyncCall(frame_buf, null, runtime, .{});
await frame;
```
This would require a wrapper function, yes? Since we must make someNonAsyncFunction comply with the Async callconv where it previously did not.
Agreed, that would require a wrapper function.
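For illustration, a purely conceptual, hand-written sketch of such a wrapper (someNonAsyncFunction and the wrapper name are hypothetical; the compiler-generated version would of course differ):

```zig
fn someNonAsyncFunction() void {}

// Conceptually, the wrapper is an async-callconv function with a trivial frame:
// it calls the plain function synchronously and stores the result in the frame
// so that `await` can unwrap it later.
fn someNonAsyncFunctionAsyncWrapper() callconv(.Async) void {
    return someNonAsyncFunction();
}
```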
Okay, I think I know everything I need to to get started on this now - thanks for the help!
Is the whole async thing something that's strictly necessary or just some syntax/language sugar? I.e. is there a workaround or are there certain things that simply aren't possible in Zig currently without proper async support?
Q2. Function Pointers
This question is about the language specification. When taking a pointer to a function, we may not yet know whether or not it is async (as this is determined after semantic analysis). More to the point, even if we did know this, it's not a part of the function's type, so we can't differentiate between an async and non-async function when calling a runtime-known function pointer. Since the code generated for calling async and non-async functions is necessarily different, how is this handled? As far as I can tell from looking through code from the old C++ compiler implementation, this just wasn't handled at all before, and you'd probably just get a crash if you attempted to call an async function through a pointer. The best solution I can think of to this is just that you can't take a reference to an async function (a compile error is emitted if you try) - we could emit these errors retroactively after determining a function is async - but perhaps there's a better solution I'm not seeing.
I believe there are likely better options than simply crashing or disallowing pointers to an async function. An async function could have two entry points. For instance, the first entry point could adhere to the normal function calling convention and determine where to jump to continue execution, along with how to restore the frame state. Essentially, it would be akin to wrapping an async function in a normal function, which is not unusual since Zig, to the best of my knowledge, aims to be color-free. The second entry point would be invoked if the function was directly called as an async function.
If you fix the code size of the first entry point, it should still be possible to cast a function pointer back to an async function, as you could always calculate the position of the second entry point from that. This doesn't pose a significant issue; for example, the first entry point could always be a backward jump to a larger chunk of machine code specific to that function. Then, the second entry point would consistently proceed after that jump instruction.
Furthermore, having such a jump instruction at the beginning of a function is not unheard of. For instance, hot-patchable functions often have nop instructions or a jump instruction jumping x bytes forward at the beginning; these can later be modified into a jump to the new patched function.
Is the whole async thing something that's strictly necessary or just some syntax/language sugar? I.e. is there a workaround or are there certain things that simply aren't possible in Zig currently without proper async support?
On paper async does enable things you couldn't otherwise achieve. Coroutines allow the 'pausing' and 'resuming' of functions, something you really can't achieve without inline assembly; you can only emulate this in various ways. Async allows control of the memory location of a function's frame. This is important to the goals of the project, as one can use this to prevent a recursive function from taking up an arbitrary amount of stack by allocating its frames on the heap.
It's possible to use Zig's async for generators. I am not sure if this is the intended use case, though; who knows how efficient or 'nice' this may be.
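For example, a rough generator-style sketch in stage1-era syntax (yielding through a global is just one simple way to do it; counter and current are hypothetical names):

```zig
const std = @import("std");

var current: u32 = undefined;

// Each resume advances the "generator" by one step.
fn counter(limit: u32) void {
    var i: u32 = 0;
    while (i < limit) : (i += 1) {
        current = i; // "yield" the value through a shared location
        suspend {} // hand control back to the consumer
    }
}

test "async as a generator" {
    var frame = async counter(3); // runs until the first suspend
    const gen: anyframe = &frame;
    try std.testing.expect(current == 0);
    resume gen;
    try std.testing.expect(current == 1);
    resume gen;
    try std.testing.expect(current == 2);
}
```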
In practice many things are possible without async. Async for event loops and thread pools can be replaced by function pointers w/ state. I'm struggling to think of any practical examples where async code is strictly necessary.
Async is super useful when you have lots of smaller tasks.
Such as if you lint/validate a bunch of files. Doing each file in a separate thread is wasteful. Writing your own scheduler with semaphores is tedious. Being able to just rely on a language feature for this is super useful. (Edit: oh and especially when you want parts to depend on other parts via await, such as await lintFilesInDir("foo"))
Similarly, we don't strictly need for loops when we have while loops. But they're such a nice quality-of-life feature.
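As a small sketch of that kind of await-based dependency, in stage1-era syntax (lintFile and lintTwoFiles are hypothetical stand-ins for real, suspending work):

```zig
const std = @import("std");

// Hypothetical per-file work; imagine it suspends while waiting on I/O.
fn lintFile(path: []const u8) void {
    _ = path;
}

fn lintTwoFiles(a_path: []const u8, b_path: []const u8) void {
    // Start both lints without blocking the caller, then wait for both results.
    var a = async lintFile(a_path);
    var b = async lintFile(b_path);
    await a;
    await b;
}

test "fan-out and await" {
    // Wrapped in `async` so the test itself stays non-async.
    _ = async lintTwoFiles("foo.zig", "bar.zig");
}
```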
Is the whole async thing something that's strictly necessary or just some syntax/language sugar? I.e. is there a workaround or are there certain things that simply aren't possible in Zig currently without proper async support?
I will not enter the "philosophical" part of it, as there are many benefits, but I will give you a concrete example.
I use Zig in WASM in the browser, and sometimes I need to use a JS API that is only available as a promise.
With async, it would look like this (pseudocode):
```
// from zig
suspend {
    current_frame = @frame()
    my_js_function_called_from_zig(js_function_arguments, current_frame)
}

// from JS
function my_js_function_called_from_zig(args, frame) {
    mypromise_func(args).then(() => zig_resume(frame))
}

// back in zig
fn zig_resume(frame) {
    resume frame
}
```
At present, I implement this with a JS worker that does a blocking loop with atomics, but it is far from ideal.
Please, don't go the horrendous path of async. It'll be a massive time sink, the ABI for function calling will never be the same, resulting in function colors, and the entire ecosystem will either be split, or all async.
Could you look into stackful continuations and effects instead?
Please, don't go the horrendous path of async. It'll be a massive time sink, the ABI for function calling will never be the same, resulting in function colors, and the entire ecosystem will either be split, or all async.
Could you look into stackful continuations and effects instead?
Do you mean something like setjmp and longjmp in C? And for effects, what do you have in mind?
I think that in general this is a good discussion to have. I did plenty of rust, and I hate their futures as a user even if many of the design parts make sense.
To me, it comes down to use cases. We must find what problems we (the developers that use the language) want to solve. In https://github.com/ziglang/zig/issues/6025#issuecomment-1914725896 I explain a use case that would be covered by a non-local jump. But there are more uses of async/await.
As a note: in practice, JavaScript is where I do the most async work now, and I realize that 99% of the time I only want to write sequential imperative code. But I must have await everywhere, because one function down the stack calls a JS API that must be async because the result is not immediate.
I think Apple's API such as dispatch and run loop can be of inspiration when thinking around concurrent API design.
[...] resulting in function colours [...]
Why does a function's ABI introduce function colours in any practical way? The only thing it really means is that you can't get a default-callconv function pointer to an async function; if you need to support async functions somewhere that you're using function pointers, then you can use callconv(.Async) pointers which also support non-async functions. There is technically some colouring here, but given how rare function pointers are... not in any practical sense.
Per my understanding, the goal of Zig's colourless async can be framed as follows: if I have a non-async project, I can throw an async call somewhere into it - potentially causing hundreds or thousands of functions to in turn become async - and it'll basically Just Work.
[...] the entire ecosystem will either be split, or all async.
A huge benefit of colourless async is that it should help avoid this problem. If Alice writes an async version of a package, and Bob a synchronous version, they should both be able to be plugged straight in to any project - regardless of whether or not it is already async - and everything should work with only minor changes.
@mlugg
Why does a function's ABI introduce function colours in any practical way? The only thing it really means is that you can't get a default-callconv function pointer to an async function;
It seems you have answered your own question
A huge benefit of colourless async is that it should help avoid this problem. If Alice writes an async version of a package, and Bob a synchronous version
Again, you've proved my point: the ecosystem would be divided into async and normal code.
@kuon
Do you mean something like setjmp and longjmp in C?
Continuations can be implemented by using setjmp and longjmp, yes.
The Wikipedia page for Continuations does a good job of explaining them, as well as providing a list of languages that support them.
It's important to understand what async is and what problems it solves. I would say there are two parts of what people consider async to be:
- Futures/Promises, these are a way of scheduling tasks cooperatively.
- async/await, these are keywords that delimit computations.
Delimiting computation means splitting up a function into multiple parts, providing the ability for each part to be executed in multiple ways, and allowing for control flow to be more flexible.
Async/await is what we usually call an implementation of a subset of stackless coroutines.
Coroutines are an abstraction that helps with delimiting computation, however, compared to just async/await they have the additional benefit of being able to yield multiple times. These are also usually stackless.
Continuations are a much lower level of abstraction for delimiting computation, however, they are much more powerful, providing the ability to capture the control state of the current computation as a first-class value. This can be used to implement coroutines, generators, and even effects.
Continuations are usually implemented by capturing the stack, or by having a split-stack system. This means that there is just one ABI for calling functions, the same one everyone else uses, which allows calling into C, or any other language, without any issue.
And for effects, what do you have in mind?
There is also a good Wikipedia page for Effect systems.
Effects are in essence a control flow method.
They can be used to implement cooperative scheduling (as with async + Future/Promise), but also much more than that, such as function purity, because they can be used to abstract over "colors".
Here are some examples of effects: concurrency, I/O, error handling, allocation. Because effects can be abstracted over, the user can plug in their own functionality for any of those, and more.
Continuations and effects complement each other very well; both have been very well studied for a long time. Continuations are also implemented in quite a few mainstream languages, while effects are just now gaining more attention in languages like OCaml; JavaScript has a proposal for algebraic effects, and there are new languages that experiment with them, like Koka, Eff, and Unison.
Please, don't go the horrendous path of async. It'll be a massive time sink, the ABI for function calling will never be the same, resulting in function colors, and the entire ecosystem will either be split, or all async.
Is not the whole point of Zig's async/await implementation and its focus on @Frames to not introduce this split?