
make `@returnAddress` work at compile-time

Open andrewrk opened this issue 8 months ago • 10 comments

Problem Statement 1: Error Reporting

For example, in formatted printing, when you make a mistake, it looks like this:

/home/andy/dev/zig/lib/std/fmt.zig:191:13: error: too few arguments
            @compileError("too few arguments");
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
referenced by:
    print__anon_672: /home/andy/dev/zig/lib/std/io/Writer.zig:24:26
    print__anon_413: /home/andy/dev/zig/lib/std/io.zig:312:47
    1 reference(s) hidden; use '-freference-trace=3' to see all references

One typically has to use -freference-trace to make the error output useful, and even then, it's not obvious to beginners which point in the trace is the relevant one.
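
For reference, a minimal program that produces this class of error might look like this (illustrative only; not the exact code behind the trace above):

const std = @import("std");

pub fn main() void {
    // Two placeholders but only one argument: the "too few arguments" error
    // is raised deep inside std.fmt, far away from this call site.
    std.debug.print("{s} {s}\n", .{"hello"});
}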

Problem Statement 2: Mixing Optimization Modes

Let's say I want to compile all my dependencies with ReleaseFast but my application code with Debug. Sounds good in theory, but if the std lib is compiled in ReleaseFast, then all the safety checks that live in ArrayList, for example, are now disabled, even though those safety checks were logically covering behavior of the application.

In reality, ArrayList wants to use neither the optimization mode of the standard library nor that of the application. Generally, the proper mode is that of the call site where the type is constructed.
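
To make this concrete, here is a minimal sketch (the specific call is illustrative, not taken from the issue). The assertion guarding appendAssumeCapacity lives inside ArrayList, i.e. inside the std module, so it is compiled with std's optimization mode rather than the application's:

const std = @import("std");

pub fn main() !void {
    var gpa_state = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa_state.deinit();
    const gpa = gpa_state.allocator();

    var list = std.ArrayList(u8).init(gpa);
    defer list.deinit();

    // No capacity was ever reserved. The assert that catches this is in
    // ArrayList's code: with std built in Debug/ReleaseSafe it trips here,
    // but with std built in ReleaseFast it is compiled out and this is
    // undefined behavior, even though the application is built in Debug.
    list.appendAssumeCapacity(42);
}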

Proposed Changes

@returnAddress() can be called at compile time, returning a value that represents the caller in the comptime stack. Doing so makes the current function generic, as if it took an additional comptime parameter.

This value can then be passed to @compileError, causing it to report the error at that location. This solves the first problem completely, resulting in always reporting the error in the correct place.
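
As a rough sketch of how std.fmt could use this (the argument order and the surrounding condition, including the name placeholder_count, are invented for illustration; the proposal does not pin them down):

if (args.len < placeholder_count) {
    // Reported at the caller's source location instead of at this line
    // inside std/fmt.zig.
    @compileError("too few arguments", @returnAddress());
}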

Next, there needs to be a way to access the builtin import (@import("builtin")) relative to a given comptime call site. Inventing syntax for that and combining it with #978, we end up with something like this in ArrayList:

pub fn ArrayList(comptime T: type) type {
    const mode = @import(@returnAddress(), "builtin").mode;
    return struct {
        comptime {
            @optimizeFor(mode); // note this applies to the struct scope
        }
        // ...
    };
}

This means that ArrayList(u8) created by one module will not equal ArrayList(u8) created by another module, if those modules have different optimization modes. That might be a little confusing to encounter, but I think it's a fundamental complexity that arises from the extremely compelling use case of mixing optimization modes. The mitigation is quite reasonable: when an array list is used as part of a given API, that API should create a type alias for API users, so that they share the same version. Or, if the array list will be primarily used by the API user, then it should be constructed with the optimization mode of the API user. Either choice will be available to module authors.
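
For illustration, the type-alias mitigation could look roughly like this (hypothetical library code, assuming the proposed semantics):

// some_library.zig: expose one canonical instantiation so the library and
// its users share a single ArrayList(u8) type, constructed with this
// library's optimization mode.
const std = @import("std");

pub const ByteList = std.ArrayList(u8);

pub fn appendGreeting(list: *ByteList) !void {
    try list.appendSlice("hello");
}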

For interoperability with @src(), @src() would be changed to accept the comptime callsite value as a parameter, and could then be redefined like this:

pub inline fn src() Src {
    comptime return @src(@returnAddress());
}

Note this also makes it possible to get source information without an explicit parameter, at the cost of making the function generic.
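
As a hedged sketch of what that enables (using the proposed syntax, which does not exist today), a logging helper could record its caller's location without taking an explicit std.builtin.SourceLocation parameter, at the cost of becoming generic per call site:

const std = @import("std");

inline fn traceHere(comptime msg: []const u8) void {
    // Under the proposal, @src() applied to the comptime return address
    // yields the caller's source location.
    const loc = comptime @src(@returnAddress());
    std.debug.print("{s}:{d}:{d}: {s}\n", .{ loc.file, loc.line, loc.column, msg });
}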

The result of calling @returnAddress() should probably have its own type, or it could be an opaque type defined in std.builtin.

Finally, it might make sense to introduce a new builtin rather than double-purposing @returnAddress(). The deciding factor would be whether anyone could come up with example code that would work both at runtime and at comptime that uses @returnAddress().

andrewrk avatar Apr 09 '25 02:04 andrewrk

Technically a duplicate of #14938, but I'll close that one since this one is more detailed.

mlugg avatar Apr 09 '25 12:04 mlugg

Observation: this could be a quite neat solution to other optimize-mode-dependent things in std, for instance:

//! std/debug.zig
pub inline fn runtimeSafety(ra: ?usize) bool {
    return switch (@import(ra orelse @returnAddress(), "builtin").mode) {
        .Debug, .ReleaseSafe => true,
        .ReleaseFast, .ReleaseSmall => false,
    };
}

// usage:
// if (std.debug.runtimeSafety(null)) { ... }
// or
// if (std.debug.runtimeSafety(caller_ra)) { ... }

(This isn't necessarily a vote in favour of this proposal, but it is interesting to see what it allows.)

mlugg avatar Apr 09 '25 12:04 mlugg

I like this proposal for solving comptime stack traces; this is a very annoying problem, especially the first time you encounter it.

However, allowing an implicit comptime parameter feels like a big footgun. It would allow two types that look exactly the same to the programmer to be different.

It also makes it easier to implement a bad API, for example by forgetting to create a type alias, or by implicitly assuming that types constructed in two different modules are the same, which may be missed in testing.

All in all, I don't think this would be more beneficial than passing a reference to builtin, or more narrowly the mode, directly to a type function, which would avoid the problem of two types being invisibly different.
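
To illustrate the concern (hypothetical module names; assume module_a is built in Debug, module_b in ReleaseFast, and both declare pub const List = std.ArrayList(u8);):

comptime {
    const ListA = @import("module_a").List;
    const ListB = @import("module_b").List;
    // Both are written identically in their modules, yet under the proposal
    // they would be distinct types, so this check fires.
    if (ListA != ListB) @compileError("same-looking types are not the same type");
}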

NicoElbers avatar Apr 09 '25 19:04 NicoElbers

You're suggesting that all instances of ArrayList(T) be replaced with ArrayList(T, mode)?

andrewrk avatar Apr 09 '25 19:04 andrewrk

I'm not sure that's the right choice either, but as I read this proposal, that is what's being done implicitly. Between implicit and explicit extra parameters, I think explicit parameters would be the better choice.

However, if module-specific optimization is the goal, something like 'type options', containing options such as optimization mode or libc usage, might be an idea worth considering.

EDIT: Another reasonable idea may be an ArrayListAlignedOption, in the same way that ArrayListAligned already exists.

NicoElbers avatar Apr 09 '25 22:04 NicoElbers

Here's my counter-proposal for solving the 1st problem outlined in the issue. As I understand the proposal, its solution to the problem is the following:

  • Permit calls to @returnAddress at compile time, and
  • change @src to be called with an argument, namely, the return value of some call to @returnAddress.

The combination of the two allows authors of generic functions to get information about their callers' source locations and put it in the messages they provide via @compileError - thus pointing the users of their code to the actual cause of the error (from a human perspective).

I see two shortcomings to this solution:

  • It's not the default: authors of generic code must do extra work in order to add this useful information to their printed errors.
  • It's hard to make the system work correctly: continuing with the example of formatted printing, suppose a user calls std.debug.print with bad arguments, and std.fmt.format has indeed been enhanced with this proposal's abilities - then std.fmt.format will issue an error pointing to its caller, which is found in... std/io/Writer.zig! The error doesn't point into the user's code but to some (seemingly arbitrary) std function - not much of an improvement over the status quo, and perhaps even a pessimisation. Must the whole call chain in std now pass an optional extra parameter, namely the caller's return address?

Observation

At compile time, a deliberate termination of compilation generally happens in one of two ways:

  • An encounter of @compileError, or
  • an encounter of unreachable.

The two signify different intentions; the first means a violation of a precondition - the parameters passed to a compile-time-executed function were incorrect according to the function's specification (i.e. the caller is the culprit). The second means a violation of an internal invariant - the internal logic of the function is inconsistent with its own expectations (i.e. the callee is the culprit).

In short: whenever @compileError is called - it is always the caller's fault, so the error should always point at the call site*.

Proposal

The message of a @compileError points at the call site of the function, not at the @compileError location itself.

A new builtin, @propagateErrors, is introduced that takes zero arguments. A function that contains @propagateErrors is transparent as far as error reporting from @compileError is concerned, and forwards the error's location to its caller. This builtin is intended to be used in wrapper functions that mostly just forward their arguments to a different function (these include all of the functions in std that end up calling std.fmt.format, which we see in the stack trace when calling std.debug.print with bad arguments).

In the example below, the error raised will point to the inside of main, at its call to bounds.

const std = @import("std");

pub fn main() !void {
    const low, const high = bounds(bool);
    //                            ~~~~~~
    // (4/4) Error: expected an integer type

    for (low..high) |i| {
        std.debug.print("I sure do love counting! {}\n", .{i});
    }
}

fn bounds(T: type) [2]T {
    // all of `minInt` and `maxInt`'s preconditions
    // are also this function's preconditions.
    @propagateErrors();
    // (3/4) error message that should have pointed here
    // is propagated upwards due to `@propagateErrors`.
    return .{ minInt(T), maxInt(T) };
}

/// Returns the maximum value of integer type T.
pub fn maxInt(comptime T: type) T {
    const info = @typeInfo(T);
    const int = switch (info) {
        .int => |i| i,
        // (1/4) compile error originates here
        // call stack: main -> bounds -> minInt -> maxInt
        else => @compileError("expected an integer type"),
    };
    return (1 << (int.bits -| @intFromBool(int.signedness == .signed))) - 1;
}

/// Returns the minimum value of integer type T.
pub fn minInt(comptime T: type) T {
    // all of `maxInt`'s preconditions are
    // also this function's preconditions.
    @propagateErrors();
    // (2/4) error message that should have pointed here
    // is propagated upwards due to `@propagateErrors`.
    return ~maxInt(T);
}

*There is one place where a @compileError shouldn't point at the caller: when it was invoked erroneously - that is, when the generic function's author wrote a false positive into their checks, and a set of valid inputs is wrongly marked as invalid. For these rare cases, the compiler should provide a way to see the entire call stack, right up to the actual place where the @compileError was invoked (essentially the status quo) - perhaps through a verbosity flag passed on the CLI.

Fri3dNstuff avatar Apr 11 '25 11:04 Fri3dNstuff

Another important use case is having std.debug.assert use the call site's optimization mode, rather than the standard library's.
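
For illustration only, a sketch of what that might look like under the proposal, reusing the hypothetical runtimeSafety helper from the earlier comment (none of this syntax exists today):

//! std/debug.zig (sketch)
pub inline fn assert(ok: bool) void {
    // Whether the check is compiled in is decided by the caller's
    // optimization mode, not by the mode std itself was built with.
    if (comptime runtimeSafety(@returnAddress())) {
        if (!ok) @panic("assertion failure");
    }
}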

andrewrk avatar Apr 20 '25 19:04 andrewrk

Taking inspiration from Perl's solution to this problem, another possible way of addressing both (without double-purposing @returnAddress) is to add a @caller builtin. This could return a new std.builtin.CallSite type that holds the src & mode, along with additional std.builtin values like a CallModifier. Updating the @compileError builtin to take a CallSite parameter could solve the first problem:

@compileError(@caller(), "too few arguments");

Additionally, @call could replace the std.builtin.CallModifier parameter for a std.builtin.CallSite, and this would provide a mechanism for mixing optimization modes:

@call(.{ .mode = mode }, ArrayList(u8).init, .{gpa});

benburkert avatar Apr 25 '25 00:04 benburkert

An interesting, but rare, use case for @returnAddress at comptime is in std.debug.panicExtra / the panic handler. At runtime, it is given an address to dump the stack for, but at compile time that doesn't work. If @returnAddress() gave something usable at comptime, the panic handler could have:

    if (@inComptime()) {
        @compileError(msg, first_trace_addr orelse @returnAddress());
    }

which would show the formatted panic message, as well as point at the desired return address. I called this a rare use case because I can't recall a single time I've hit a formatted panic at comptime (I barely, if ever, hit @panic at comptime).


I worry that @returnAddress sometimes making the function generic might make it hard to tell where in the code a function is being compiled multiple times. Right now, the only ways I'm aware of for a function to become generic are comptime and anytype parameters. Would it be easy to accidentally make a function generic without realizing it? I've noticed with incremental compilation how badly misuse of inline fn can slow down rebuilds, and I don't want that kind of analysis explosion to happen here. Maybe there is a way for a comptime return address to avoid making the function generic?

For example, this should never be generic, since it uses a runtime-known return address:

fn doSomething(list: *std.ArrayList(usize)) !void {
    try list.append(@returnAddress());
}

But it would be a shame if doSomething here were analyzed twice to account for the two different call sites, especially if someone discovers that they can add nicer error locations to their existing code and unintentionally recompiles the function far more often than needed.

fn doSomething(comptime s: i32) i32 {
    if (s % 2 != 0) {
        @compileError("s is invalid", @returnAddress());
    }
    return s / 2;
}

pub fn main() void {
    _ = doSomething(4);
    _ = doSomething(4);
}

paperclover avatar Apr 25 '25 05:04 paperclover

How would @import(@returnAddress(), "builtin").mode work with an @optimizeFor() in the caller's scope? Would the returned mode be the global optimization mode of the caller's module, or would a separate builtin.zig be generated for each optimization scope? If it is the latter, would this warrant removing mode from builtin entirely and moving it to dedicated compiler builtins (similar to @setFloatMode())?

Justus2308 avatar May 31 '25 14:05 Justus2308