Proposal: `@Result` to match cast builtins inference API
Note: This is a formalized proposal based on discussion here: https://github.com/ziglang/zig/issues/5909#issuecomment-662684509
The proposal started with the addition of `@ResultType`, as in `pub inline fn intCast(x: anytype) @ResultType`. It has been revised to use `@Result` and `anytype` in function declarations, as in `pub inline fn intCast(x: anytype) anytype`.
Problem: Builtin-exclusive inference API
With the recent merging of https://github.com/ziglang/zig/pull/16163, cast builtins now use an API that can't be replicated in regular userspace.
For example, take the `std.fmt.parseInt` function:
fn parseInt(comptime T: type, buf: []const u8, base: u8) std.fmt.ParseIntError!T
which currently has to be used like:
const foo = std.fmt.parseInt(u32, content, 10); // T must be explicitly provided
compared to the usage of a cast builtin which can be used like
const bar: u32 = @intCast(x - y); // T is inferred
For the sake of consistency, it seems like `parseInt` should be able to be used in a similar fashion:
const bar: u32 = try std.fmt.parseInt(content, 10);
Proposal: @Result
The introduction of a builtin `@Result` could allow the declaration of `std.fmt.parseInt` to look like this:
fn parseInt(buf: []const u8, base: u8) std.fmt.ParseIntError!anytype {
    const T = @Result();
    ...
}

pub fn main() !void {
    const word1 = "12";
    const word2 = "42";
    const foo: u32 = try parseInt(word1, 10); // @Result is u32
    const bar: u64 = try parseInt(word2, 10); // @Result is u64
    ...
}
Benefits
- This democratizes this kind of inference API currently exclusive to builtins.
- This may improve the consistency of callsites of functions with inferable types.
- Could benefit from cast-builtin-related improvements in type inference resolution, like in the case of something like:
// possible builtin inference improvement
const foo: u32 = @intCast(a + b) * @intCast(c);
// potential downstream benefit
const bar: u32 = try parseInt(word1, 10) + try parseInt(word2, 10);
- This also allows for the user to implement a function that looks like cast builtins from a caller's perspective, like this (via https://github.com/ziglang/zig/issues/5909#issuecomment-1099710303):
pub inline fn intCast(x: anytype) anytype { return @intCast(x); }
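For comparison, here is a minimal sketch of what such a wrapper has to look like today, assuming a Zig version with single-argument cast builtins (0.11 or later). The name `intCastTo` is hypothetical. The destination type has to be threaded through as an explicit `comptime T` parameter, because the wrapper cannot see its caller's result type.

```zig
const std = @import("std");

/// Hypothetical status-quo wrapper: T has to be passed explicitly.
pub inline fn intCastTo(comptime T: type, x: anytype) T {
    // The declared return type T is the result type that @intCast infers from.
    return @intCast(x);
}

pub fn main() void {
    const wide: u64 = 300;
    const a = intCastTo(u16, wide); // the caller spells out the type
    const b: u16 = @intCast(wide); // the builtin infers it from the annotation
    std.debug.print("{d} {d}\n", .{ a, b });
}
```

With the proposal, the `comptime T` parameter would disappear and the call site would read like the builtin one.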
Drawbacks
- Allows functions with identical functionality to be defined with 2 different APIs.
  - When should a user define a function with `@Result` inference vs. having a `comptime T: type` parameter?
- ?
When should a user define a function with @ResultType inference vs. having a comptime T: type parameter?
If you're returning `T`, never according to the new logic in Zig, since you can just wrap it in an `@as(T, f(...))`.
One big drawback of this is that you're adding an invisible comptime parameter which silently adds more instantiations of your functions.
I really like having custom casting functions, f.e. enhanced by `comptime` type checks specific to a use case.
With the benefits of the new inferred result type already visible in some code, I'd hate to give them up when switching from builtins to user-space functions.
In general the interface of `@ResultType()` providing the type of the result location seems minimal and sufficient to me, so fit for Zig.
However, the original proposal text glosses over type unwrapping a bit, and I think it's a rough edge worth bringing up specifically:
The cast builtins currently already unwrap error unions and optionals: `const c: error{Z}!?u8 = @intCast(@as(u16, 40));` deduces `u8`.
It would be easier to automatically do the same thing in `@ResultType()`.
IMO hiding this step from the user (and discarding the additional information) would be a bit of a shame though, but I can't quite figure out how to make it work.
- If given the full type, userspace can implement these steps manually, f.e. in a helper function, making the return type something along the lines of `NonOptionalPayload(@ResultType())`. Note that this will practically work as long as `NonOptionalPayload(@ResultType())` can implicitly coerce to the actual `@ResultType()`; `T` -> `E!?T` will always work.
- However, concluding from the last bullet point, the return type `std.fmt.ParseIntError!@ResultType()` constructed in one example would probably not work: No matter what `R = @ResultType()` we provide, the returned type `E!R` can never match the original `R` expected/deduced from the expression's result location. ... That is, unless result locations were propagated through (error-unwrapping `if` and) `try` expressions, and then stripped `E!T` -> `T`.
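The manual unwrapping step can be sketched in today's Zig. This is only an illustration: the body of `NonOptionalPayload` is an assumption about what such a helper might do (strip error-union and optional layers), and the lowercase `@typeInfo` tags assume Zig 0.14 or later (older versions spell them `.ErrorUnion` and `.Optional`).

```zig
const std = @import("std");

/// Hypothetical helper: peels error-union and optional layers off T,
/// mirroring what the cast builtins do with their result type.
fn NonOptionalPayload(comptime T: type) type {
    return switch (@typeInfo(T)) {
        .error_union => |info| NonOptionalPayload(info.payload),
        .optional => |info| NonOptionalPayload(info.child),
        else => T,
    };
}

pub fn main() void {
    // The wrapped type from the example above reduces to its integer payload.
    const Inner = NonOptionalPayload(error{Z}!?u8);
    std.debug.print("{s}\n", .{@typeName(Inner)});
}
```

A function returning `NonOptionalPayload(R)` relies on the usual `T` -> `E!?T` coercion at the call site, which is the first bullet's point.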
Maybe we would want both options, accessible as `@ResultType()` and `@ResultTypeErrorUnionPayload()`/`@NonErrorResultType()`?
Or we choose not to provide the ~~second~~ first one, because behavior dependent on the callsite error set is too implicit.
Well, I can't really think of a non-confusing use case right now, so maybe it can actually be that simple.
One big drawback of this is that you're adding an invisible comptime parameter which silently adds more instantiations of your functions.
That is true; in my custom `meta` functions I often have a nicer interface in `f` that wraps an `fImpl` function with a more verbose signature.
Especially when stripping a type (like `?T` -> `T`) is done in userspace, you would probably want an `Impl` function like that to deduplicate the instantiation.
For me this is an okay approach in `meta` code, while in other areas it could get rather crowded.
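A rough sketch of that `f`/`fImpl` split, using an explicit `comptime R` parameter in place of the proposed builtin. All names here (`Child`, `parseImpl`, `parse`) are hypothetical, and the `.optional` tag assumes Zig 0.14 or later. The thin wrapper strips the optional in userspace, so `u8` and `?u8` both forward to the same instantiation of the implementation.

```zig
const std = @import("std");

/// Hypothetical helper: unwraps one optional layer, if present.
fn Child(comptime T: type) type {
    return switch (@typeInfo(T)) {
        .optional => |info| info.child,
        else => T,
    };
}

/// Verbose implementation, instantiated once per payload type.
fn parseImpl(comptime T: type, buf: []const u8) !T {
    return std.fmt.parseInt(T, buf, 10);
}

/// Thin wrapper: parse(u8, ...) and parse(?u8, ...) share parseImpl(u8, ...).
inline fn parse(comptime R: type, buf: []const u8) !Child(R) {
    return parseImpl(Child(R), buf);
}

pub fn main() !void {
    const a = try parse(u8, "12");
    const b = try parse(?u8, "42");
    std.debug.print("{d} {d}\n", .{ a, b });
}
```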
Although for explicitness here's another idea: Instead of implicitly passing a type for `@ResultType()` to read from, we could instead make it an explicit parameter declaration:
fn(comptime R: type = @ResultType(), x: R) R {return x;}
Downsides:
- We don't have assignment syntax in arguments for any other use case yet.
- We declare a parameter slot that needs to be omitted from call sites.
The most in-line with current syntax would actually be a capture IMO:
// `keyword |capture|` has precedence in Zig
fn |R| f(x: R) R {return x;}
// could use an additional keyword
fn resulttype |R| f(x: R) R {return x;}
// (could use an operator, but no precedence of this in Zig, therefore I like it less)
fn -> |R| f(x: R) R {return x;}
// could also put it after the function name, so `fn <name>` remains Ctrl+F-/grep-/searchable
fn f |R| (x: R) R {return x;}
I like this idea a lot
What if it was `infer` or `@Infer` instead? One could imagine a future follow-up proposal for constraining the type to be an integer without specifying the size, or to implement a handful of functions like `isLessThan`, etc.
Related: the `infer` keyword is proposed in https://github.com/ziglang/zig/issues/9260 too, and the syntax actually pairs pretty well.
@Jarred-Sumner So you're thinking something like this instead?
fn parseInt(buf: []const u8, base: u8) std.fmt.ParseIntError!infer T {
...
}
The `infer` keyword might communicate the existence of multiple function instances better than a builtin `@ResultType`.
That could also be complementary if the infer syntax was already available to be used on parameters (as in the mentioned #9260).
fn foo(bar: infer T) infer K {
...
}
I like this proposal as is, but I can also see a slight modification:
`@ResultType()` seems to function basically the same as `anytype` for parameters, in that they both implicitly make the function polymorphic. So reusing `anytype` in the function signature makes sense to me. `@ResultType()` would often still be necessary to access the actual result type within the function, but it could be made to only be usable within a function body.
Advantages:
- `anytype` is shorter and easier to read/type than `@ResultType()`
- `@ResultType()` can be used even when the result type is specified explicitly (removes redundancy when the result type in the function signature is a complicated comptime expression)
- `anytype` communicates intention better when used in a function signature (possibly subjective, but IMO it makes it clear that you need to use comptime checks if you want to limit what the result type might be, just like with `anytype` parameters)
- The `@ResultType()` intrinsic and inferred-return-type-via-`anytype` could be implemented as two separate, smaller features
Disadvantages:
- Perhaps slightly more difficult to learn: two different keywords to remember (in the context of return type inference; `anytype` already exists)
related: #447
I think the `anytype` return type is also a valid option, but if so, `@ResultType()` is a terrible name and I would much rather have `@ReturnType()`.
I like `@ReturnType()` much better than `@ResultType()`. It's way clearer that the type is inferred from the function return location, wherever that may be.
Hmm, personally both feel pretty similar. What about `@CallsiteType`?
To me `Return` and `Result` can both be read to mean the result returned / decided by the function itself.
However, there is at least precedent for the term "Result Location Semantics" in Zig's nomenclature.
I agree `CallSite` (or maybe `CallSiteDestination` if there is some propagation through expressions like `try`) would be more explicit, imo preferable.
(nitpick note: Wiktionary lists both `call site` and `callsite` although the first seems preferred, Wikipedia also went with the first spelling. Langref is currently unopinionated at 5 vs 5 occurrences.)
As was already mentioned in other discussions, it's a bit cumbersome to use `@as` to specify the return type of an expression. For example:
std.log.info("{}", .{@as(u32, try std.fmt.parseInt(content, 10))})
At the same time, Zig already has a perfect way to specify the type of a constant/variable/function-argument with a colon syntax:
const x: u32 = try std.fmt.parseInt(content, 10);
It would be really nice to allow the same syntax to specify the type of an arbitrary expression (like in Julia lang, but they use double colon for this purpose). So the example above will look like this:
std.log.info("{}", .{try std.fmt.parseInt(content, 10): !u32})
Another example from here with builtins. Old syntax:
return @intCast(@as(i64, @bitCast(val)));
New syntax:
return @intCast(@bitCast(val): i64);
I don't have a strong opinion either way on this proposal, but I have a few notes:
- `@ReturnType` would IMO be a very poor name; that sounds like something you'd use in a function body to get the return type of the function. `@ResultType` is a clear name which uses the lingo of RLS (rightly, since that is where this feature comes from). Alternatively, returning `anytype` or some `infer T` (if #9260 gets in) would also be reasonable, since it would show that this works a lot like a generic parameter in that it creates a separate instantiation.
- Builtins using an API which can't be replicated in userspace is not new or controversial, and in fact is the norm. For instance, `@min`/`@max`/`@TypeOf`/`@compileLog` are varargs, `@field` is an lvalue, and `@import`'s argument must be a string literal.
- @log0div0, if you want to seriously propose that syntax it should be a separate issue.
@mlugg
* `@ReturnType` would IMO be a very poor name; that sounds like something you'd use in a function body to _get_ the return type of the function. `@ResultType` is a clear name which uses the lingo of RLS (rightly, since that is where this feature comes from). Alternatively, returning `anytype` or some `infer T` (if [Proposal to improve the ergonomics and precision of type inference in generic functions #9260](https://github.com/ziglang/zig/issues/9260) gets in) would also be reasonable, since it would show that this works a lot like a generic parameter in that it creates a separate instantiation.
Yes, the `@ReturnType()` was in response to the `anytype` keyword for the return type. That's exactly what the premise is there. It would refer to the return type of the function, not the inferred type from the call site. There would be a level of indirection where the function return type says "infer the return type" and then you're saying "use the return type, no matter if inferred or not."
If we have
fn a() anytype {
    return std.mem.zeroes(@ReturnType());
}
I have a mostly neutral stance on this proposal.
Just a thought on the name, let's have the following code:
pub fn build_something() @ResultType() {
    var something: @ResultType() = undefined;
    switch (@typeInfo(@ResultType())) {
        ....
    }
    return something;
}
It doesn't really work with `anytype` in the body (same with `infer T`):
pub fn build_something() anytype {
    var something: anytype = undefined;
    switch (@typeInfo(anytype)) {
        ....
    }
    return something;
}
It should support assignment:
pub fn build_something() @ResultType() {
    const T = @ResultType(); // comptime-known
    var something: T = undefined;
    switch (@typeInfo(T)) {
        ....
    }
    return something;
}
From here, we could argue that the following would also make sense (@N00byEdge's suggestion):
pub fn build_something() anytype {
    const T = @ReturnType(); // comptime-known
    var something: T = undefined;
    switch (@typeInfo(T)) {
        ....
    }
    return something;
}
Now the argument I could have for it, is easier code maintenance if we have something like:
// myfield has type IsoDate
mystruct.myfield = deserialize(IsoDate, str);
// If we change myfield type, we need to find the deserialize call and change the type too
// with proposal
mystruct.myfield = deserialize(str);
// No need to change the type here
For types that could be automatically coerced, this could prevent some bugs and ensure the returned type is always exactly the type of the destination variable that the returned value is assigned to (similar to why the first argument was removed from builtins). Taking this into account, I could be slightly in favor of this.
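For reference, part of this maintenance benefit is already reachable in status-quo Zig by reading the type off the destination with `@TypeOf`. The sketch below uses a hypothetical `Record` struct and a hypothetical `deserialize` that simply parses an integer. It illustrates the call-site pattern and is not a claim about any real deserialization API.

```zig
const std = @import("std");

const Record = struct { year: u16 };

/// Hypothetical stand-in for the `deserialize` in the comment above.
fn deserialize(comptime T: type, str: []const u8) !T {
    return std.fmt.parseInt(T, str, 10);
}

pub fn main() !void {
    var rec: Record = .{ .year = 0 };
    // The destination type is read off the field, so changing the field's
    // type later does not require touching this call.
    rec.year = try deserialize(@TypeOf(rec.year), "2024");
    std.debug.print("{d}\n", .{rec.year});
}
```

The proposal would remove the need to name the destination twice on that line.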
@kuon Basically agree with everything you've got here. I too am neutral on this.
Now the argument I could have for it, is easier code maintenance if we have something like:
But we have to be careful, that is an argument against it too, as mentioned before:
One big drawback of this is that you're adding an invisible comptime parameter which silently adds more instantiations of your functions.
One big drawback of this is that you're adding an invisible comptime parameter which silently adds more instantiations of your functions.
I don't think this is actually a very strong argument. Comptime arguments are already passed the same as normal parameters, meaning a function's call-site can't actually always tell us when passing a different value would incur an extra instantiation.
For example:
foo(1, 2);
foo(3, 4);
Assuming one or both of the parameters of `foo` could be `comptime`, there are potentially two generic instantiations of `foo` here, and you can't know that until you look at the function prototype. This is also a characteristic of `inline` parameters, which will make a function instantiate a runtime variant or a corresponding comptime variant, depending on the comptime-known-ness of the argument.
This feature would have the same drawback as the status quo and the accepted proposal: you have to read the function signature to know whether a call will generate separate instantiations.
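To make the `foo` example concrete, here is a small sketch with one possible prototype. The signature is an assumption for illustration, since the comment does not give one. Only the declaration reveals that the first argument is `comptime`.

```zig
const std = @import("std");

// Hypothetical prototype: the first parameter is comptime, so each distinct
// value of `a` produces its own instantiation of `foo`.
fn foo(comptime a: u32, b: u32) u32 {
    return a + b;
}

pub fn main() void {
    // Both calls look like ordinary calls, yet they instantiate foo twice
    // (once with a = 1, once with a = 3).
    std.debug.print("{d} {d}\n", .{ foo(1, 2), foo(3, 4) });
}
```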
I agree with @InKryption that there are many places where a function instance can be created transparently. I think this feature would be similar to functions accepting `anytype`.
On embedded platforms, when I want to limit the use of some functions to reduce binary size, I work around this problem by adding a comptime assertion on the type. For example, if I want a generic over `u8` and `u16` but not signed types or other sizes, I do something like:
pub fn something(myint: anytype) void {
    // @TypeOf gives the argument's type; reject everything except u8 and u16.
    switch (@TypeOf(myint)) {
        u8, u16 => {},
        else => @compileError("type not supported"),
    }
}
I actually have helper functions for that, but you get the idea.
Full support for this proposal! It would be so much more ergonomic and readable to be able to use this in userspace, such as with `std.mem.zeroes()` for setting default field values.
I'd like to propose `@Result()` as the name of the builtin. No Hungarian notation; PascalCase already indicates it is a type (see `@This()`).
What will happen if the call site does not have a specific type, e.g. due to peer type resolution or anytype?
The same you'd get with the cast builtins: `error: @intCast must have a known result type` and `note: use @as to provide explicit result type`.
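A short sketch of the builtin behavior this refers to, assuming Zig 0.11 or later. The function names are made up for the example. The rejected form is left as a comment so the file still compiles.

```zig
const std = @import("std");

// Result type comes from the annotated declaration.
fn viaAnnotation(wide: u64) u8 {
    const narrow: u8 = @intCast(wide);
    return narrow;
}

// Result type comes from @as, as the compiler note suggests.
fn viaAs(wide: u64) u8 {
    // const bad = @intCast(wide); // rejected: nothing to infer the type from
    const narrow = @as(u8, @intCast(wide));
    return narrow;
}

pub fn main() void {
    std.debug.print("{d} {d}\n", .{ viaAnnotation(40), viaAs(40) });
}
```

A userspace function using the proposed builtin would presumably be called the same two ways.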
There seem to be a lot of suggestions here on how to signal to the compiler to infer the output type of a function. Perhaps NOT specifying any output type should just mean "infer it"? It's hard to beat 0 characters as far as syntax minimalism goes.
fn build_something() {
var something = ..;
return something;
}
Yeah, just inferring is the cleanest imo.
There seem to be a lot of suggestions here on how to signal to the compiler to infer the output type of a function. Perhaps NOT specifying any output type should just mean "infer it"? It's hard to beat 0 characters as far as syntax minimalism goes.
fn build_something() {
var something = ..;
return something;
}
I agree that it's a clean option, but it breaks one of Zig's core principles: being very readable, even by people not familiar with Zig. If a random user wants to know the possible return types of the function, they will have to look at every place where that function is called to try to find a type, whereas `@Result()` or `@ReturnType()` is explicit enough that anyone can understand that this function's return type is inferred from the call site. Maybe I'm stupid, but if I imagine myself going into a codebase and finding function prototypes that don't return anything, I'd be pretty confused about what they are doing, especially if I see a `return` keyword.
I agree, that it's a clean option, but now you've broke one of Zig's core principle of being very readable
Wouldn't not specifying a type with `const` or `var` constitute the same thing?
Wouldn't not specifying a type with const or var constitute the same thing?
Exactly, `const a: anytype = f(..)` is not more readable than `const a = f(..)`; it's just less writeable.
There's an important distinction to make between the proposed `@Result` builtin and what many people would expect of inferring the return type of a function: the `@Result` builtin gives the function's result type (as determined by the function's callsite), not an inferred type based on the `return` expressions in the function itself. To give an example of where this might be confusing:
fn add(a: u32, b: u32) @Result() {
return a + b;
}
const c = add(2, 2); // error: call to 'add' must have a known result type
(the error here is by analogy with existing builtins which rely on their result type: https://github.com/ziglang/zig/issues/16313#issuecomment-1739865764)
If defining a function without any return type had this behavior, it would be very confusing for new users who might expect the return type to be inferred as `u32`, or who just forgot to write the return type and are then confronted with a bizarre and unexpected error at the callsite of the function (rather than at its definition).
Additionally, regardless of how the return type is written for such functions (omitted, `anytype`, etc.), the `@Result()` builtin (or something equivalent) would still be needed to access the result type from within the function body. For example, implementing `parseInt` using an inferred result type (as in the original proposal description) would require the use of `@Result()` in the function body to determine what type of integer needs to be parsed.
My hope is that the following would work.
fn add(a: u32, b: u32) {
return a + b;
}
const c = add(2, 2);
In this case, the compiler knows that `add` can only accept `u32`s as inputs, so the `comptime_int`s at the call site will be coerced to that type. The compiler also knows that `add` can only return `u32`, so when I assign its output to `c`, `c` can never (in a non-confusing world) be anything but `u32`. If I can do this logic in my head, then so can the compiler. Of course, the effect of this on compilation speed needs to be determined. It also gets trickier when the input to the function is `anytype`, `comptime_int` or `comptime_float`, in which case the output type may be ambiguous and a call-site type declaration or explicit casting may be necessary to resolve this ambiguity.
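As a point of reference, status-quo Zig can already compute a return type from the parameters by writing it as an expression, which covers part of this wish without any call-site inference. A minimal sketch (this is a different mechanism from the proposed `@Result`):

```zig
const std = @import("std");

// The return type is derived from the argument types, not from the call site.
fn add(a: anytype, b: anytype) @TypeOf(a + b) {
    return a + b;
}

pub fn main() void {
    const x: u32 = 2;
    const y: u32 = 2;
    const c = add(x, y); // no annotation needed; c has the type of x + y
    std.debug.print("{s} {d}\n", .{ @typeName(@TypeOf(c)), c });
}
```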
In some ways, making the above piece of code work is orthogonal to (and perhaps easier than) being able to access the inferred output type in the function body itself with `@Result()`. This is because if the code in the function body depends on the inferred output type and the inferred output type itself naturally depends on the code in the function body, then there is a circular dependence. This is probably a recipe for trouble.
This proposal is not really about inference of function return types by their body contents. If you have suggestions around that, that might belong in a different proposal.
The goal of this proposal is to enable callsite-inferred generics. I think it's important to keep the `anytype` in the function signature to signal that this is a generic function. Moving the `anytype` from the list of parameters to the return type in the function declaration doesn't lose readability IMO, plus it gains inference ergonomics (and arguably readability) at the callsite.
Changes to builtins in the last year allowed for more readable casts, for example, and this proposal is just about enabling those same semantics in non-builtin user code.