Proposal: Stack-Capturing Macro
Summary
Stack-capturing lambdas are normally unsafe, since the lambda may have a lifetime longer than the calling function. This proposal introduces a new kind of lambda that is allowed to perform stack captures but obeys special rules that allow the compiler to prove that these captures are safe. This proposal is based on Kotlin's concept of inline lambdas, though it deviates from that implementation in some ways.
Description
Unlike functions, Stack-Capturing Macros (or "macros" for short) are not values. They may only exist as parameters to functions which are marked inline. You cannot create a variable or struct that contains a macro.
Macro parameters are declared with a macro prototype, which appears in a parameter list. Macro prototypes are not type declarations; they are a new syntax, similar to a function type declaration. Unlike function types, macro prototypes may use the !T syntax to infer error types from the macro body. When passing a macro as an argument, the token => is used to indicate that the following expression is a macro. The capture syntax (|x|) is optionally used to capture the macro parameters, and after that comes an expression (which may be a labelled or unlabelled block). "return" and "try" within this scope are relative to the surrounding function, not the macro. break :label may be used to exit to an outer scope. Doing this behaves as if the function and macro were inlined, and runs any defers in the function or macro.
/// iterates the entries in a bucketed hash map
inline fn foreach(self: *@This(), action: macro (key: *Key, value: *Value) void) void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            action(&node.key, &node.value);
        }
    }
}
fn countValues(self: *@This(), value: Value) usize {
    var count: usize = 0;
    self.foreach(=> |_, v| {
        // references to stack variables
        if (v.* == value) count += 1;
    });
    return count;
}
fn hasValue(self: *@This(), value: Value) bool {
    return blk: {
        self.foreach(=> |_, v| {
            // nonlocal return
            if (v.* == value) return true;
        });
        break :blk false;
    };
}
fn findKeysWithValue(self: *@This(), value: Value, allocator: *Allocator) ![]*Key {
    var list = std.ArrayList(*Key).init(allocator);
    errdefer list.deinit();
    self.foreach(=> |k, v| {
        if (v.* == value) {
            // this try refers to the outer scope.
            try list.append(k);
        }
    });
    return list.toOwnedSlice();
}
// macro returning value
inline fn pollUntil(condition: macro (index: usize) bool) void {
    var index: usize = 0;
    while (!condition(index)) : (index += 1) {
        poll();
    }
}
fn pollFourTimes() void {
    pollUntil(=> |x| x >= 4);
}
// macro with no parameters
inline fn retry(comptime T: type, max_attempts: usize, action: macro () !T) !T {
    var attempt: usize = 1;
    while (true) : (attempt += 1) {
        if (action()) |value| {
            return value;
        } else |err| {
            if (attempt >= max_attempts) return err;
        }
    }
}
fn openNetworkFile(path: []const u8) !Handle {
    return retry(Handle, 3, => blk: {
        doSomeSetup();
        defer doTeardown();
        break :blk openNetworkFileUnstable(path);
    });
}
Because they cannot exist in variables, there is no TypeInfo entry for a macro. There are also no macro types. @TypeOf(macro_param) is a compile error. You would never need to do this though, because macro parameters are always known to be macros within the text of the function. When inspecting the TypeInfo of a function, macro parameters show as having a null type, just like anytype parameters. We could add more info to the FnArg structure if necessary to handle this case.
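For illustration, a rough sketch of what inspecting such a function could look like, assuming the status-quo TypeInfo layout (args/arg_type are today's std.builtin field names; foreach is the hypothetical declaration above, which of course does not compile today):

const assert = @import("std").debug.assert;
comptime {
    // sketch: the macro parameter `action` would report a null type, just like an anytype parameter
    const info = @typeInfo(@TypeOf(foreach)).Fn;
    assert(info.args[1].arg_type == null);
}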
Macros may not be closed over in a function literal, but they may be referenced from within other macros, and they may be passed directly as a macro parameter to another inline function. Macros may accept macros as parameters, but only if the compiler can inline them all. Y-combinators which do not terminate at compile time will not work. This is governed by the eval branch quota. Inlining a macro into itself, directly or indirectly, counts as one backwards branch.
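As a minimal sketch (not from the proposal text), here is a macro parameter being referenced from inside another macro and forwarded through a second inline function, reusing the foreach declaration above:

// sketch: `action` is a macro parameter, referenced inside the macro passed to foreach
inline fn foreachValue(self: *@This(), action: macro (value: *Value) void) void {
    self.foreach(=> |_, v| action(v));
}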
Unlabelled break and continue don't work inside a macro that appears in a loop. This prevents a potential class of errors:
for (some_list) |item| {
    item.inner_list.foreach(=> |inner| {
        if (inner.value == item.target) break;
    });
}
Here the programmer intended to break out of the inner loop, but the language would interpret this as a break from the outer loop (because the inner loop is inside the function and can't be seen). Instead, this should be a compile error, and a labelled break or continue can be used to get out of the inner loop:
next_list: for (some_list) |item| {
    item.inner_list.foreach(=> |inner| {
        if (inner.value == item.target) continue :next_list;
    });
}
Macros allow stack references in a way that the compiler can prove is safe.
Edit: This proposal previously read "If a macro has no parameters, the capture list is omitted." This would be ambiguous with an expression block, so it was changed to require an empty capture list.
Edit 2: Macros were originally blocks, but upon reflection they make more sense as expressions. This also means that try and return in a macro refer to the outer function, rather than the macro expression. And a token (=>) is needed for declaring a macro. Also added a bunch more examples.
This is a surprisingly elegant solution to the "modern usecase of lambda expressions". I'm not sure if zig needs this, but the basic idea is quite nice and clean. But some questions:
- Can I store a macro in a const? I might need a predicate or a sort criterion multiple times:
var threshold = 0.4; // runtime value
const is_included = |x| { return x > threshold; };
var count: usize = 0;
count += my_list_1.countAll(is_included);
count += my_list_2.countAll(is_included);
- Can I call macros after declaring them? This would solve #5610
const err_or_void = || {
    try foo();
    try bar();
}();
someCleanup();
if (err_or_void) |_| {} else {
    std.debug.print("oh no!\n", .{});
    return error.Unexpected;
}
I'm a fan, not sure though if zig needs this feature…
Can I store a macro in a const?
The proposal as written would say no, but this is an interesting use case. I think I'm still inclined to say no though. A property of macros as described above is that they execute only as part of the statement in which they are declared. You can't declare a macro on one line and then execute it on a different line. This means there is never any confusion about whether a macro is capturing the current value of a stack variable or a reference to the variable - it's both. This also ensures clearly that it's impossible for a macro to escape the bounds of the locals it references. A macro really isn't a function that executes later, it's a block that executes right now, but in a deeper scope.
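To make that concrete, a sketch of how the example above would have to be written under this proposal (countAll is the hypothetical method from the question, not part of the proposal):

var threshold: f32 = 0.4; // runtime value
var count: usize = 0;
// the predicate cannot be stored in a const, so it is written out at each call site
count += my_list_1.countAll(=> |x| x > threshold);
count += my_list_2.countAll(=> |x| x > threshold);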
Can I call macros after declaring them?
No, but you can do this:
const localTry = inline fn (blk: macro () !void) !void {
    try blk();
};
const err_or_void = localTry(|| {
    try foo();
    try bar();
});
someCleanup();
if (err_or_void) |_| {} else {
    std.debug.print("oh no!\n", .{});
    return error.Unexpected;
}
I'm a fan, not sure though if zig needs this feature…
That's kind of how I feel too :P
I absolutely love the basic idea of this proposal. Most languages simply don't have a reasonable way to abstract over code blocks that only differ in a few lines of code and a few dynamic variables. Function pointers and non-capturing anonymous functions help somewhat, but usually have a substantial cost in both verbosity and performance. In my opinion, the proposed functionality could considerably increase the power of the language for implementing various types of iterators, filters, generic algorithms and context-wrapping functions (like with_open_file or with_drawing_context). With this, you get 80% of the power of Lisp macros without any AST-manipulations and custom evaluation semantics.
I'm not as much of a fan. If I understood this correctly (and maybe I overlooked something), we can already get 80% of the same effect by
- creating a local struct { fn action(self: *@This(), ...) !void } to encapsulate variables and the methods that operate on them,
- passing it as actor: anytype, and
- calling it as actor.action.
The overall design makes sense to me, but I don't appreciate the individual changes either:
- "If a macro has no parameters, the capture list is omitted."
return and try can now (with the same syntax) only apply to an anonymous block, if that block is contextually passed as a macro argument. If we want block-scoped try, then instead how about try :block syntax? (We already have break :block for "local returns".)
- "Unlike functions, Stack-Capturing Macros (or "macros" for short) are not values. They may only [...]"
This makes sense in the proposed design, but status-quo local structs simply don't have these limitations. So if they get in your way, you need to "upgrade" from macro back to a full struct.
I personally value the principle of having one way to do things too much to prefer this "shorthand option", where one would have to decide between the two (potentially for every single interface individually).
To me it seems more valuable to unify/simplify common use cases of the current language constructs.
Some ideas for the use cases presented here:
- scoped try :block <expression> syntax
- I think I would implement "nonlocal break" by returning an enum {continue, break} to the foreach function. If we change that to union {continue: void, break: label_name}, the foreach could exit and return the label to break from to the caller. So now what's missing would be a way to break to the right label based on an enum value, f.e. breakto <expression> [optional value]. (We could even limit the expressions to ConstantLiteral syntax .labeled_block so that it's all comptime-checkable.)
- For capturing stack variables, maybe we could go the opposite direction and allow usingnamespace on a stack-local struct instance. That way we could do this:
var stackToCapture = struct {
    count: usize = 0,
    fn action(self: *@This(), key: anytype, value: anytype) !void { ... }
}{};
usingnamespace stackToCapture;
// refer to `count` as if it were a local variable directly, pass &stackToCapture as callback-with-data
The proposal as written would say no, but this is an interesting use case. I think I'm still inclined to say no though.
Yeah that would be my opinion as well. It would get too messy quickly and most macros are unique anyways...
No, but you can do this:
const err_or_void = localTry(||{ try foo(); try bar(); });
I really like this, we created a new language feature out of macros already, killing two birds (#229, #5610) with one stone (#6965)
If I understood this correctly (and maybe I overlooked something), we can already get 80% of same effect by
Not really. Capturing local variables either requires a lot of indirection, storing the locals in a "local variables struct" (which means every normal access is prefixed), or you have to capture them by hand into pointer struct fields.
try :block <expression> syntax
I think this is worth a separate proposal, as it solves a completely different use case, which I discovered could also be solved with this proposal.
For capturing stack variables, maybe we could go the opposite direction and allow usingnamespace on a stack-local struct instance. That way we could do this:
This is confusing and usingnamespace is not a tool that should be used often. It's also just a syntactical trick.
Capturing local variables either requires a lot of indirections and storing the locals in a "local variables struct", which means every normal access is prefixed or you have to capture them by hand into pointer struct fields.
I don't think there is much of a semantic difference between declaring stack variables separately and grouping them into a struct and declaring an instance of that. The only difference is in access syntax.
usingnamespace is not a tool that should be used often.
Why? As I understand it, importing symbols from a different namespace means your code depends on the symbols declared somewhere else, which is an implicit remote dependency. What I proposed is a new mechanism that doesn't currently exist. In the example I showed, it links the function code to the function-local field declarations. All local, good visibility. I don't see an argument against allowing it. If you think this shouldn't be allowed for non-local structs, then we can declare it invalid for non-locally defined types.
It's also just a syntactical trick.
That's my point. Both solutions just present syntactical convenience over something we can already do with less limitations.
If field access feels too burdensome, we can enhance usingnamespace, or introduce some other syntax if you'd prefer. Otherwise we can remain as we are.
The new semantics that come with this proposal seem to me like a differently-flavoured tool for the same job. Unless they permit a new use case not solvable by structs and status-quo that I'm oblivious to.
I quite like this proposal personally.
The one modification I would make would be to require macro arguments to be comptime.
If a macro has no parameters, the capture list is omitted.
This is ambiguous, I think; I'll change it to require an empty capture list. That's also ambiguous with the || operator, but unary || isn't a thing, so that should be enough to disambiguate.
return and try can now (with the same syntax) only apply to an anonymous block, if that block is contextually passed as a macro argument.
This is a fair point. This is actually where my proposal deviates from Kotlin's definition. Returns in a Kotlin inline lambda return from the outer function. But this only works because the last expression in a Kotlin lambda produces its value, which gives them an alternate syntax to return. I suppose we could allow a label on macros though, and use break :macro_label to yield a value. I could argue then that return from inside a macro marks invisible control flow, but you could make the same argument about nonlocal breaks.
If I understood this correctly [...], we can already get 80% of same effect [with anon functions + state objects]
It's true that you can accomplish the same computation with anonymous functions and state objects, but you cannot assume that that will be performant. Using functions here has several problems:
- you are now taking the address of a local variable, so the optimizer needs to put it in memory and needs to consider aliasing possibilities
- you are copying memory around and hoping that the optimizer will realize and remove it.
Essentially, using functions for this is taking something that is straightforward for the frontend (put that loop in my code and put this code in the loop), turning it into a really hard problem for the optimizer, and then hoping that the optimizer will figure it out. This problem is one of the main reasons that many game developers avoid C++ lambdas.
The proposed macros sidestep this problem by forcing the frontend to emit something that is friendly to the backend (just a bunch of scopes, no pointers to locals, no extra functions). That's what makes this different. It's more constrained than lambdas, but it guarantees that it will not generate a function call, it will not generate extra copies, and it will not generate extraneous memory clobbers. IMO these guarantees are enough to make this a sufficiently different operation from passing a function (and therefore not another way of doing "the same thing").
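As a sketch of what that means in practice, this is roughly what the frontend could emit for the countValues example above after inlining foreach and splicing in the macro body (illustrative only, not a specified lowering):

fn countValues(self: *@This(), value: Value) usize {
    var count: usize = 0;
    // body of foreach, inlined
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            // body of the macro, spliced in; `count` and `value` stay plain locals
            if (node.value == value) count += 1;
        }
    }
    return count;
}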
require macro arguments to be comptime
This brings up a problem that I didn't think about before. I think putting comptime on a macro implies that it can be executed in a comptime block, meaning that it does not capture runtime locals.
It's more constrained than lambdas, but it guarantees that it will not generate a function call, it will not generate extra copies, and it will not generate extraneous memory clobbers. IMO these guarantees are enough to make this a sufficiently different operation from passing a function (and therefore not another way of doing "the same thing").
Word. This proposal is a simple syntax transformation; it should be easy to implement in both stage1 and stage2 by just doing some AST transformations, whereas function pointers yield inherently worse results due to increased complexity.
It's true that you can accomplish the same computation with anonymous functions and state objects, but you cannot assume that that will be performant.
Indeed I was hoping that this type of transformation would be something fully in the realm of capability for an optimizing compiler. In particular, here is how my implementation would work in my head:
- The code is translated to an AST and then translated to IR, and the foreach function is inlined at some point during this. (Inlining is forced by the user, or chosen by the compiler, advisable since the unique argument type makes the instantiation unique. Same goes for inlining actor.action.)
- The compiler has to analyze all three functions as one, because they have been inlined. (Also Zig compiles all code as one translation unit anyway, so cross-analysis of known functions is not very restricted.)
- It notices the pointer to the struct instance doesn't escape the scope of the variable declaration, and therefore chooses to statically resolve all references relative to the stack frames (i.e. as local variables) instead of producing runtime pointers.
Even if Zig already had macros, I would advocate to implement these optimizations for the struct case. I'd be interested to know which step of this is unrealistic.
This problem is one of the main reasons that many game developers avoid C++ lambdas.
Interesting, I was under the impression people liked to use templates because (just like comptime parameters) they lead to the compiler inlining more code, which enables more optimization opportunities. C++ lambdas are 100% just syntactic sugar for template classes with overloaded call operators; the only barrier for optimizations using them would come from type erasure, f.e. by std::function. But then again, I might not be in the loop.
Amendment: I agree that this is "just an AST transformation" that "should be easy to implement". My concern is whether we want to facilitate this one use case with special-case syntax, type, and semantics, instead of reducing friction in the components we already have in the language.
I was under the impression people liked to use templates
The prevailing opinion in the game development community seems to be that templates and lambdas can be fine for runtime performance if all optimizations are turned on, but compilation time and performance in debug mode are also very important, and templates are terrible for both of these. Indeed, all of the optimizations you described are possible and will probably be done in release-fast mode. But the other build modes also matter, and so does compilation time.
That said, header parsing time is a massive sink for C++ compile time, and templates can only reasonably go in headers. Zig is better about this, so it may not be as much of a problem.
My concern is whether we want to facilitate this one use case with special-case syntax, type, and semantics, instead of reducing friction in the components we already have in the language.
This is a good and valid concern, but I don't see any way to reduce the friction enough here without introducing a new concept. Here's the status quo "null hypothesis" way to do this with functions:
/// iterates the entries in a bucketed hash map
inline fn foreach(self: *@This(), action: anytype) !void {
    for (self.buckets) |bucket| {
        var current = bucket.first;
        while (current) |node| : (current = node.next) {
            try action.each(&node.key, &node.value);
        }
    }
}
fn countValues(self: *@This(), value: Value) usize {
    var countValuesEach = struct {
        count: usize = 0,
        target: Value,
        pub inline fn each(self: *@This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) self.count += 1;
        }
    }{ .target = value };
    self.foreach(&countValuesEach) catch @compileError("no errors possible");
    return countValuesEach.count;
}
fn hasValue(self: *@This(), value: Value) bool {
    self.foreach(struct {
        target: Value,
        pub inline fn each(self: @This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) return error.FoundTheValue;
        }
    }{ .target = value }) catch return true;
    return false;
}
fn findKeysWithValue(self: *@This(), value: Value, allocator: *Allocator) ![]*Key {
    var findKeysEach = struct {
        list: std.ArrayList(*Key),
        target: Value,
        pub fn init(target: Value, alloc: *Allocator) @This() {
            return .{ .target = target, .list = std.ArrayList(*Key).init(alloc) };
        }
        pub fn deinit(self: *@This()) void {
            self.list.deinit();
        }
        pub inline fn each(self: *@This(), k: *Key, v: *Value) !void {
            if (v.* == self.target) {
                try self.list.append(k);
            }
        }
    }.init(value, allocator);
    errdefer findKeysEach.deinit();
    try self.foreach(&findKeysEach);
    return findKeysEach.list.toOwnedSlice();
}
The mental gymnastics that I have to do to try to understand this code are too much. It's not worth making the foreach function at all here, it's more readable to just write the loop in each function. In this case I would probably instead make an iterator struct and use that. But iterators can be even harder for the optimizer to process, and worse on debug performance, so that doesn't really solve the problem either. They are also much more difficult to write than the iteration loop, and require more effort from readers to understand.
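For comparison, here is roughly what such an iterator struct could look like for the bucketed map (a sketch with hypothetical Bucket and Node types, simplified):

const Iterator = struct {
    buckets: []Bucket,
    bucket_index: usize = 0,
    node: ?*Node = null,

    pub fn next(self: *Iterator) ?*Node {
        while (true) {
            // drain the current bucket's linked list first
            if (self.node) |n| {
                self.node = n.next;
                return n;
            }
            // then move on to the next bucket, if any
            if (self.bucket_index >= self.buckets.len) return null;
            self.node = self.buckets[self.bucket_index].first;
            self.bucket_index += 1;
        }
    }
};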
I don't think the usingnamespace on instance idea helps any of these examples. The availability of the locals to the return statement is not the problem.
I think I would implement "nonlocal break" by returning an enum {continue, break} to the foreach function. If we change that to union {continue: void, break: label_name} the foreach could exit and return the label to break from to the caller. So now what's missing would be a way to break to the right label based on an enum value, f.e. breakto <expression> [optional value]. (We could even limit the expressions to ConstantLiteral syntax .labeled_block so that it's all comptime-checkable.)
Limiting this to comptime-known values doesn't actually work. There's no way to run a function at runtime that returns a comptime-known value. This would have to be some totally new system in order to be guaranteed to be statically resolved, and I can't imagine any version of that which produces readable code.
It is possible that generators may be able to help here, by making it easier to write iterators. But that also assumes a very powerful optimizer, and puts a burden on it to think more (and longer) about how to optimize.
Making this an expression (|x| x and |x| blk: { break :blk x }) would be a bit nicer and would not need to change what try and return mean within the scope of a function, which would make more sense with it being an inline AST trick and not a separate function on its own; otherwise it becomes a special case of the two. Labelled or otherwise, try would fit better outside of macro definitions.
Other than that, sounds rather neat.
Indeed, all of the optimizations you described are possible and will probably be done in release-fast mode. But the other build modes also matter, and so does compilation time.
Valid point for unoptimized Debug mode and compilation time.
Limiting this to comptime-known values doesn't actually work.
I was a bit unclear. I meant that the syntax of breakto <expression> [optional value] could limit expression to be a ComptimeLiteral, but the compiler would end up creating an enum to represent all labels in the function. (Afaik automatic enumeration and a goto table is how suspend/resume point information is managed for async functions already, so basically expose an idea like that to userland.) EDIT: And I meant to switch on the result of self.foreach, which would be that enum's type - yes, that would require specifying the enum in the union return type.
Here's one more idea to help the syntax of the status-quo approach (though decidedly not the compilation complexity):
//Introduce a new builtin: @currentScopeMixin(comptime Extension: type): anytype.
//This magically makes all stack declarations visible at this point (including function parameters) be auto-wrapped into a struct instance, and adds all declarations of `Extension` to that struct type.
//Downside: Would require adding `const` fields to the language for the semantics of this builtin type. (Then again, we already have `comptime` fields just for tuples...)
fn findKeysWithValue(self: @This(), value: Value, allocator: *Allocator) ![]*Key {
    var list = std.ArrayList(*Key).init(allocator);
    errdefer list.deinit();
    try self.foreach(&@currentScopeMixin(struct {
        fn each(self: anytype, k: *Key, v: *Value) !void {
            if (v.* == self.value) {
                try self.list.append(k);
            }
        }
    }));
    return list.toOwnedSlice();
}
EDIT: fixed passing the address of the instance constructed by @currentScopeMixin, not a copy.
When passing a macro as an argument, the capture syntax (|x|) is used to declare parameters and a scope is opened. If a macro has no parameters, an empty capture list (||) is used.
Zentropy += 1
~~My primary concern here is that whether or not a macro can be used depends on the enclosing function being inline, which is declared at the definition -- so, it's a syntactically nonlocal, semantically significant detail, which contradicts the very first point of "Why Zig?". For that reason, I'm against this.~~ (Addressed by below comment.)
breakto
This solves the problem but doesn't provide the same guarantees. Instead of doing jmp address you are now doing jmp table[runtime_index], which is a much slower operation. The optimizer might be able to make this better in some cases but it's far from clear that it always will. And if it doesn't, there's no warning or error, you just have slower code than if you had written the loop inline yourself.
@currentScopeMixin
This is interesting, but it's by definition an implicit stack capture, which is exactly what this proposal is trying to avoid. The result of this macro could easily be put into a struct or copied somewhere else and referenced later, causing stack corruption.
Making this an expression
I think this is a good idea. Blocks will still be necessary to encode a statement, but blocks are expressions so this is ok.
linkedList.foreach(|x| { sum += x; });
linkedList.sumIf(|x| x > 4);
Zentropy += 1
I'm not sure what this means, but if I had to guess based on the context you're saying that empty capture lists aren't a thing anywhere else in the language so this is inconsistent? That's a fair point. An alternative would be to have an operator that indicates that a macro is being declared. Maybe something like this?
linkedList.foreach(=> |x| { sum += x; });
sevenTimes(=> std.debug.print("this will print seven times\n", .{}));
whether or not a macro can be used depends on the enclosing function being inline
I may not have been clear in the proposal, but the function which passes the macro doesn't need to be inline. Only the function receiving the macro needs to be marked inline. The inline specifier is in the function header, right next to the macro prototype in the parameter list, which is validated by the compiler. So you're not going to get a surprise compile error in one place because some other function isn't inline. The compile error will appear on the function that is trying to declare a macro prototype parameter even though it isn't marked inline. Macros and functions are not at all interchangeable, so I don't see the problem here.
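In other words, a sketch of where the error would be reported (the exact wording is made up here):

// hypothetical error wording; the point is where it is reported
fn notInline(action: macro () void) void { // error: macro parameter requires an inline function
    action();
}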
Upon understanding the proposal a bit more, I think it's a good idea, but should be managed with care. I think, in particular, that there should be no special-case control flow -- keywords should require labels to snap to macro scope, as in blocks (this fits well with making macro bodies expressions, as well as #5610). Also, I don't think a macro type declaration needs parameter names, since a macro instance will use its own anyway.
An alternative would be to have an operator that indicates that a macro is being declared. Maybe something like this?
~~If we do have an operator, I think it should be infix: |x, y| => /* body */. Not that one specifically though, since it might look weird in switch arms (a => => //...) -- maybe ->, or :?~~
In typing this, I realised something -- if a comptime switch returns a macro (which should be allowed, as it doesn't have the temporal problem of assigning it), then that creates a syntactic ambiguity with payload capture:
inline_func(switch (foo) {
    .bar => |x| x < 4, // Is this a payload capture, or a macro?
    else => unreachable,
});
To resolve that, why not reuse the trick we already use to declare literals without prepending them with types -- the humble .. Declare a macro type as macro (T, S) R, write a concrete macro as .|x, y| z or .|| z. I don't think the empty bars are a problem here, since they don't play exactly the same role as in branch capture -- here, they mean "capture zero elements", whereas missing bars in a branch expression means "don't capture this one element".
I have a feeling Andrew might reject this proposal out of hand, because it looks a bit too magical -- I hope that doesn't happen. It really is incredibly powerful, and there's no magic happening if you think about it.
breakto
Instead of doing jmp address you are now doing jmp table[runtime_index]
Again, this holds true for unoptimized Debug mode - and actually, only if Debug mode is allowed to override user-forced inlining decisions. I honestly don't remember the performance of Debug mode ever being under so much scrutiny, but fair enough.
EDIT:
The optimizer might be able to make this better in some cases but it's far from clear that it always will.
Oh, I just now realized that the intermediate return unifying control flow is the source of this issue (as I understand it).
So do we need the label enum type to be comptime-only? I guess that doesn't work because we need to differentiate them by runtime path.
So we need direct control flow, a normal return doesn't quite cut it.
@currentScopeMixin
This is interesting, but it's by definition an implicit stack capture, which is exactly what this proposal is trying to avoid. The result of this macro could easily be [...]
This proposal introduces a new comptime-only type and states new rules that apply to no other type.
Assuming we need these rules for the optimizations we want out of them, we can put the same restrictions on the result of @currentScopeMixin, even if it reduces symmetry and flexibility:
- Since we pass it as a pointer, the receiving function already shouldn't make assumptions about the pointee's lifetime.
- To make sure that the pointee isn't copied, we can change the result of @currentScopeMixin from being a struct to being an opaque type. (And maybe @currentScopeMixinAddress is the better baseline then, idk.) EDIT3: This would mean that the member functions first have to @ptrCast(*@This(), self). If @ptrCast isn't valid for this for some reason, it might be valuable to provide builtins @opaqueify and @deopaqueify for this. (see also non-copy proposals #3803 and #3804 which may provide an alternative without changing to opaque)
So now that we have the same usage restrictions, here are the remaining differences between the proposed inline macro syntax and the proposed @currentScopeMixin[Address]:
- Constructing macros needs some kind of syntax, and it seems like it would benefit from unique syntax that isn't used anywhere else - although granted, stateless lambda syntax could end up looking similar. The syntax for a builtin function returning an instance of a builtin type already exists, f.e. @typeInfo.
- The idea of associating data with comptime-known functions as struct and opaque already exists, and is afaik the prevalent way of currently implementing this. That means that all code currently written for them (that doesn't copy the pointer/pointee somewhere) will automatically work with a @currentScopeMixin, and code written for a @currentScopeMixin will automatically work with a manually-crafted struct or opaque type.
- To me it seems that specifying the semantics in terms of struct and opaque and optimizing their usage helps a broader spectrum of use cases than would benefit from the new macro type.
- To get the scoped-try behaviour of this, I'd prefer try :block syntax usable anywhere. EDIT2: Although actually, if a block's type is deduced as ErrorUnion, this coincides with <expression> catch |e| breakto :label2 e.
I think the one thing left unaccounted for would be break-ing from within the macro-code into the caller-code (EDIT: without requiring the extra step of a function result and a manual breakto - or actually breakfrom might be a better name).
Having a general mechanism for direct control flow like this (EDIT: like a comptime-only type label/scope) would make sense. It might come down to whether that makes the intermediate function's defer sequence feel like "hidden control flow" or not.
Also thank you for seriously addressing my comments.
there should be no special-case control flow
Yeah, I've come around to this one. I always found nonlocal returns to be the weirdest part of kotlin's version, but that's because I was thinking of them as producing lambda functions. Thinking of them as expressions/blocks makes this much less weird. I'll modify the proposal to bring this in.
if a comptime switch returns a macro
I don't think this can be allowed. Macros cannot be values, and arbitrary expressions cannot produce them. If that can happen, then you have this problem:
// argument is block expression producing macro value
withMacro(blk: {
    var local: u32 = 0;
    break :blk => { local += 1 };
});
This macro captures a variable that goes out of scope before use. So invoking this macro corrupts the stack. That sort of stack capture was what was ruled out in #229. In theory, there is an alternate form of this proposal where macros are values and the compiler verifies at each inlining site that they don't capture any variables that are out of scope. But I felt that that could produce all kinds of complexity that doesn't make sense. (what does it mean to store a macro in a comptime struct and then invoke it in a separate call to the same function at comptime? Does it capture the locals that were available in the first call or the second? Does it just reference variables in scope by name? Would it make sense to invoke it in a separate function that has locals of the same names?)
I think macros need to be all-the-way values (with variables and struct fields and everything) or all the way not (doesn't even support assignment, not compatible with expressions), so that it's really clear to the programmer what can or can't be done with them. I don't think this restriction significantly limits their usefulness.
To be really clear, I'm not proposing modifying the Expr node at all. I'm proposing modifying ArgumentList (or ExprList as it's currently called) and ParamDeclList to accept macro declarations and macro prototypes, respectively. Since a macro parameter isn't actually a variable, I wouldn't be opposed to requiring some special token on it (eg. $myMacroParam) to turn macros in expressions into parse errors instead of compile errors.
If you need to comptime switch to determine the behavior of a macro, you can put the switch inside the macro without any loss of functionality. If you need to comptime switch based on the number of arguments in a macro, you must put that switch outside of the entire function call expression. But the same requirement holds for the function declaration. Macro prototypes are not compatible with type expressions either, so in order to have a function which takes a variable number of macro parameters you must have multiple function declarations and a comptime value which is set to one of them. I can't really think of a reason you would ever want to do this though.
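For example, the switch from the earlier snippet can simply move inside the macro body (a sketch, using the same hypothetical inline_func and comptime-known foo):

inline_func(=> |x| switch (foo) {
    .bar => x < 4,
    else => unreachable,
});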
Is this a payload capture, or a macro?
I actually chose the => token because of this ambiguity. It is both. You can consider a switch to be an expression which selects one of several macros to immediately run based on the switch value and match expressions.
we can put the same restrictions on the result of @currentScopeMixin
This would work, but I think if we're going to introduce those restrictions we might as well get a really convenient syntax out of it.
direct control flow like this would make sense
Yeah, I've been thinking a bit about how this would work. It's kind of like a "reverse inline". Like "I want the compiler to take this code and inline it at each return in the called inline function, so that I can have the parameter be comptime known".
Something like
// mark the return value as "inline", may be captured at call site with inline_return
// return expressions must be comptime known
inline fn compareFmt(a: usize, b: usize) inline []const u8 {
    return if (a < b) "{} < {}" else "{} >= {}";
}
fn compareToFour(a: usize) void {
    inline_return (compareFmt(a, 4)) |fmt| {
        // fmt is comptime-known but may hold different values.
        // this is kind of a generalization of inline for.
        std.debug.print(fmt, .{a, 4});
    };
}
(side note, I also kind of want this for switches)
const T = union(enum) {
    a: extern struct { ptr: *u8, len: usize },
    b: extern struct { len: usize, ptr: *u8 },
};
var value: T = ...;
const len = inline switch (value) {
    // this text compiles for each possible value but they produce different machine code
    else => |val| val.len,
};
But I think if your goal is something like "iterate a data structure and return immediately if you find a certain value", this is a pretty tortured way to express that. It's also worth noting that you can emulate this inline return feature with macros, in what I think is a much more clear way:
inline fn compareFmt(a: usize, b: usize, withFmt: macro (comptime fmt: []const u8) void) void {
    return if (a < b) withFmt("{} < {}") else withFmt("{} >= {}");
}
fn compareToFour(a: usize) void {
    compareFmt(a, 4, => |fmt| {
        // fmt is comptime-known but may hold different values.
        std.debug.print(fmt, .{a, 4});
    });
}
I find the => |x| x syntax to be more confusing than .|x| x would have been, since it doesn't fit the style of the rest of the language and messes with the visual parsing of the => token (which may be due to Haskell/Idris use). Maybe some other syntax would do, but => communicates this => (to) that, the same as -> would.
Another note, this would be brilliant for testing as you could make testing contexts without much friction.
const text = "argument not supplied by fuzzer";
fuzz(rngFn, testedFn, .{ .@"5" = text }, .|args, result| {
const expect = .{ 0, 1, 2, 3, 4, text, 6 };
testing.expectEqualWithSlice(expect, try result);
});
Along with other "do-notation-like" contexts where values need to be fixed up after.
Concerning notation: Why is the => or the proposed alternative .| | necessary? From skimming the thread, the reasons seem to be:
- To distinguish macros and capture lists
- To avoid empty ||
However, AFAICT, capture list expressions as currently used in the language have exactly the semantics of macros, so there is no need to distinguish them. E.g., for (elts) |x| { foo(x); } could actually be implemented in userland with this proposal: for(elts, |x| { foo(x); }). (In terms of interface; the actual implementation would need a goto, and you couldn't use break or continue.)
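A sketch of what that userland version could look like (forEach is hypothetical, the element type is fixed to keep the example small, and break/continue are left out as noted):

inline fn forEach(elts: []const u32, body: macro (elem: u32) void) void {
    var i: usize = 0;
    while (i < elts.len) : (i += 1) {
        body(elts[i]);
    }
}

// usage, with the decorator-free syntax argued for here:
forEach(elts, |x| { foo(x); });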
The capture expressions in switch, catch, while, else, etc. also have the same semantics: they are kind of inline mini-functions that are fed the arguments specified between the | | by the parent construct, but otherwise interact directly with the parent scope like ordinary blocks. In short, a macro as proposed here.
An extra decorator is also not necessary to distinguish macros from similar objects: If it appears within the argument list of a function call, it's a macro; otherwise it is one of the special constructions using capture lists. If the receiving function expects a normal function argument rather than a macro (not meant for inlining), the compiler can yell at you.
Concerning empty capture lists: True, they don't occur in the language, but that's because all current uses of capture lists pass at least one argument. But in the generalization proposed here, there are legitimate cases where you don't need to pass anything and simply execute a code block as is, so this seems like a non-issue to me.
I like that much better, actually. Although it does mean an empty capture list has to be examined in context to distinguish it from the error set combination operator.
Changing the error set sum to && wouldn't be too much of a change and would somewhat keep the meaning of joining the sets, at the cost of looking slightly less visually appealing.
@tauoverpi & means bit-intersection, && means logical conjunction (can be seen as event intersection) in other languages. For it to mean "join"/union seems extremely counter-intuitive to me, at least from a mathematical perspective.
Although it does mean an empty capture list has to be examined in context to distinguish it from the error set combination operator.
Yeah, I forgot about that. The ambiguity is only in the tokenizer, though (I think). The parser itself will not be made much more complicated because of it. So maybe this particular ambiguity can be deemed acceptable.
@rohlem The issue there is that no other operator really fits either, apart from variations on ||, and any additions to it make it just as confusing. The other choice would be either or similar, but same issue of not reading as well there.
Edit: The ambiguity is more serious than I thought, since A || blk: { break :blk error{Bar}; } is in fact legitimate and currently supported syntax. Still, since || as error set join is an infix operator, while in a capture list context it can only occur in prefix position, that should be sufficient for disambiguation. The original comment is mostly left as it was, but I'm less sure of it now.
@tauoverpi, I don't think that changing the error set join is necessary at all. ~To expand a bit on my previous comment, the ambiguity would be pretty bad if it were possible to write something like ErrorSet1 || blk: { break blk: ErrorSet2; }(). Fortunately, there is no reason whatsoever to allow that, and~ the context-dependence of || therefore becomes quite narrow: If it is encountered as an infix operator in a type context, it always means error set join. If it is encountered as the first token in a function argument, it always means empty capture list. (Unless someone can think of further complications?) The dual meaning of || can only really throw off direct pattern matching on the source text (e.g. in simple syntax highlighters or search-and-replace). But that can't be helped, and besides, there are already other things that do that, for example ! in !void, !true, != and "Hello!". Personally, I don't consider this a serious issue, but if others disagree, I'd prefer using the .|x| syntax rather than change the error set combination operator.
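A sketch of the two readings side by side, reusing the retry example from the proposal but written with the decorator-free syntax discussed here:

// infix operator between types: always the error set join
const CombinedErrors = error{FileNotFound} || error{AccessDenied};

// first token of a function argument: always an empty capture list introducing a macro
const handle = try retry(Handle, 3, || openNetworkFileUnstable(path));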
<offtopic> I was surprised to find out that the error set join is a fairly rare operation, with a whopping 133 legit matches in the entire Zig repo. So maybe changing its syntax wouldn't be that dramatic. But like @rohlem, I'm stictly against && on grounds of set theory, and I doubt that Andrew would be interested in @ErrorJoin(A, B, C) or something :grin:) </offtopic>
ambiguity of ||
Yeah, this is not meaningfully ambiguous. As @zzyxyzz points out, there is already precedent in the language for this sort of context dependence, in inferred error types in return expressions. Unary not is not normally supported on types, but if it begins a return type expression it means "infer the error set". So this would be a solution.
However, I think it's inconsistent to require an empty capture list for macros but not require it in variants of if/for/switch/while with no captures. And we can't have no token and no capture list, because that's ambiguous with an unlabeled block that executes with the arguments. The compiler could figure it out, but it would be really hard for programmers to read. I'm open to other tokens besides => though, or even a keyword. But . doesn't quite fit, because without the capture list that's .{ x += 4; } which is ambiguous with anonymous structs. Maybe macro, inline, expr, or hoist?
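For concreteness, the keyword alternative would read something like this (purely illustrative; none of these keywords are decided):

linkedList.foreach(macro |x| { sum += x; });
sevenTimes(macro std.debug.print("this will print seven times\n", .{}));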
However, I think it's inconsistent to require an empty capture list for macros but not require it in variants of if/for/switch/while with no captures.
Isn't that a bit different though, since the presence or absence of the capture syntax in those changes their actual behavior and what types they work on, while an empty capture list here would fundamentally behave the same way as a non-empty one?
However, I think it's inconsistent to require an empty capture list for macros but not require it in variants of if/for/switch/while with no captures.
On the other hand, functions that take no arguments still carry the empty (), so why shouldn't argument-less macros be required to carry an ||? The built-in constructs break this rule because it would be really ugly to write if (cond) || { ... } every time, but I see this more as a concession to practicality than a rule to be followed.