get rid of the `.` in tuples & anonymous struct literal syntax

Open andrewrk opened this issue 5 years ago • 32 comments
I'm not sure if it's possible to do this, but here's an issue at least to document that we tried.

I believe the current problem is that when the parser sees

{ expression

It still doesn't know whether this is a block or a tuple. But the next token would be either `,` making it a tuple, `;` making it a block, or `}` making it a tuple. Maybe that's OK.

Then tuples & anonymous struct literals will be less noisy syntactically and be easier to type.
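
The one-token lookahead described above can be sketched as a small classifier. This is only an illustration under assumptions: `Kind` and `classifyAfterExpression` are hypothetical names, the token is modeled as a single byte, and nothing here is taken from the actual Zig parser.

```zig
const std = @import("std");

/// What an opening `{` followed by one expression would turn out to be.
const Kind = enum { tuple, block };

/// Hypothetical helper (not part of the Zig parser): decide between tuple
/// and block from the single token that follows `{ expression`.
/// Returns null for any other token, which a real parser would handle elsewhere.
fn classifyAfterExpression(next_token: u8) ?Kind {
    return switch (next_token) {
        ',', '}' => Kind.tuple, // `{ a, ...` or `{ a }`
        ';' => Kind.block, // `{ a; ...`
        else => null,
    };
}

test "one token of lookahead separates the cases" {
    try std.testing.expect(classifyAfterExpression(',').? == Kind.tuple);
    try std.testing.expect(classifyAfterExpression(';').? == Kind.block);
    try std.testing.expect(classifyAfterExpression('}').? == Kind.tuple);
}
```

Under this rule a block whose only statement is missing its semicolon, such as `{ voidFn() }`, lands in the `}` case and would be classified as a tuple.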

andrewrk avatar Apr 14 '20 21:04 andrewrk

This would be nicer to type, but the ambiguity with blocks could lead to some unintuitive compile errors. Consider this example:

fn voidFn() void {}

export fn usefulError(foo: bool) void {
    if (foo) {
        voidFn() // forgot semicolon here
    } else {
//  ^ current compile error: expected token ';', found '}' on first token after forgotten semicolon
        voidFn();
    }
}

Here I've forgotten the semicolon on the expression in the if statement, and the compiler gives an error at the first token after the missing semicolon. But with this proposal, the missing semicolon would cause the block to be interpreted as a tuple, equivalent to this code:

fn voidFn() void {}

export fn weirdError(foo: bool) void {
    if (foo) .{
//  ^ error: incompatible types: 'struct:42:15' and 'void'
//  ^ note: type 'struct:42:15' here
        voidFn() // forgot semicolon here
    } else {
//         ^ note: type 'void' here
        voidFn();
    }
}

For a forgotten semicolon, this is a very confusing error to get, especially for someone new to the language who hasn't learned anonymous struct syntax yet. This might be a solvable problem by identifying cases where this parsing is close to ambiguous and improving the error messages that result, but the two interpretations are different enough that it might be difficult to give an error message that is meaningful for both.

SpexGuy avatar Apr 14 '20 22:04 SpexGuy

Oh, I think there's also an ambiguous case. These two statements are both valid but mean very different things:

fn voidErrorFn() error{MaybeError}!void {}
comptime {
    var x = voidErrorFn() catch .{};
    var y = voidErrorFn() catch {};
}

SpexGuy avatar Apr 14 '20 22:04 SpexGuy

Good point on the ambiguous case, and good explanation in the other comment.

For the ambiguous case I'd be willing to change {} to mean "tuple of 0 values / struct with 0 fields" and find some other way to specify the void value. @as(void, undefined) already works, but is a bit scary. Maybe void{} if #5038 is rejected.

andrewrk avatar Apr 14 '20 22:04 andrewrk

Mixing blocks with tuples & structs is the very reason I don't like this proposal. I often forget the brackets .{ } for printf functions, but I have never forgotten the little dot. What new features are blocked by this proposal?

jakwings avatar Apr 14 '20 23:04 jakwings

> What new features are blocked by this proposal?

nothing - this is pure syntax bikeshedding :bicyclist:

andrewrk avatar Apr 14 '20 23:04 andrewrk

> I'd be willing to change {} to mean "tuple of 0 values / struct with 0 fields" and find some other way to specify the void value.

Hmm, that would definitely work, but it also might have the side effect of unexpectedly converting

fn voidErrorFn() error{MaybeError}!void {}
export fn foo() void {
    voidErrorFn() catch {
        //commentedOutCall();
    };
}

to

fn voidErrorFn() error{MaybeError}!void {}
export fn foo() void {
    voidErrorFn() catch .{};
}

when the programmer tries to test commenting out that line.

This could potentially be solved, but we would have to go down kind of a strange road. Zig currently makes a distinction for many types of blocks between expressions and statements. The if in x = if (foo) { } else { }; is an expression. As a result, it requires a semicolon. But if (foo) { } else { } is not an expression but a full statement, so it does not require a semicolon. If we changed <expression_evaluating to !void> catch { code block } to a full statement, so it wouldn't require a semicolon, we could then define {} to mean empty block in a statement context and empty struct in an expression context. This would be kind of a weird facet of the language to play into, but it would fix all of the problems I've brought up and replace them with a single weird but consistent thing. In most of the cases I've thought through, this definition can produce a useful error message if you accidentally get the wrong interpretation of {}. But this could still end up being a footgun in comptime code.

SpexGuy avatar Apr 14 '20 23:04 SpexGuy

Maybe we can change block syntax, #4412

mogud avatar Apr 15 '20 01:04 mogud

OK, I'm convinced to close this in favor of status quo.

andrewrk avatar Apr 15 '20 01:04 andrewrk

Another option - maybe already discussed, I couldn't find any discussion about it yet - is to use [] for the anonymous list literals (and tuples).

ghost avatar Apr 15 '20 01:04 ghost

Inspired by ({}) in JavaScript and (0,) in Rust:

Idea No.1:

{}              // empty block -> void
({})            // still empty block -> void,
                // because a block is also an expression, not a statement
{,}             // empty tuple/struct
{a} {a,b} {...} // tuple with N elements
{.a = x, ...}   // struct
{...;}          // block -> void
label: {...}    // block

const a = [_]u8{1, 2, 3};
const a: [3]u8 = {1, 2, 3};
const a = @as([3]u8, {1, 2, 3});
const a = [_]u8{};  // error?
const a = [_]u8{,};
const a = @as([0]u8, {});  // error!
const a = @as([0]u8, {,});

fn multiReturn() anyerror!(A, B) { ... return {a, b}; ... }

Idea No.2:

()              // empty tuple
(,)             // empty tuple
(,a)            // tuple with 1 element
(a,)            // tuple with 1 element
(a, b, ...)     // tuple with N elements

{}              // empty block (asymmetric to tuple syntax)
{,}             // empty struct
{, ...}         // struct
{...}           // block/struct
label: {...}    // block
({})            // still empty block -> void

const a: [_]u8 = (1, 2, 3);
const a = @as([3]u8, (1, 2, 3));

const a: T = {};  // error if T is not void?
const a: T = {,};
const a = @as(T, {});  // error if T is not void?
const a = @as(T, {,});

const a = [_]u8.(1, 2, 3);
const a = T.{};
↑ Rejected :P https://github.com/ziglang/zig/issues/760#issuecomment-430009129

const a = [_]u8{1, 2, 3};
const a = T{};
↑ Don't worry? https://github.com/ziglang/zig/issues/5038

fn multiReturn() anyerror!(A, B) { ... return (a, b); ... }

jakwings avatar Apr 15 '20 03:04 jakwings

A really cool but probably bad option is to define the single value of void as the empty struct. This would remove the ambiguity since it doesn't matter if {} is a block or an empty struct, they both evaluate to the value of void! But this could cause a lot of other weirdness, like var x: Struct = voidFn(); causing x to be default-initialized, so it's probably not something we should do.

SpexGuy avatar Apr 15 '20 05:04 SpexGuy

I suggest reconsidering this proposal.

As @andrewrk noted in the 0.6.0 release notes, 0.7.x may be the last chance to make bigger changes to Zig and polish the language into something more elegant, intuitive, and simple. There are a lot of proposals related to this:

#4294 always requires brackets, so it reduces the use cases for blocks and resolves issue #1347. It is also related to #5042 and #1659, which can remove the () used in if/while/for/switch.

#4412 adds a new keyword seqblk for labeled blocks; maybe blocks should use this kind of syntax.

#4170 shows that, in order to keep consistency, the anonymous function literal has a weird syntax.

#5038 removes T{}, and is also related to #4847, the array default initialization syntax.

#4661 removes bare catch, which is related to requiring brackets and to the examples below.

IMO we should take all these things into account. Here's the ideal syntax I would prefer:

// named block
const x = block blk {
    var a = 0;
    break :blk a;
};

block outer while {
    while {
        continue :outer;
    }
}

// unnamed block
block {
    var a = 0;
    assert(a+1 == 1);
}

// switch
const b = switch {
    a > 0 => 1,
    else => 0,
}

const b = switch var a = calc() {
    a == 0 || a == 1 => 0,
    else => 1,
};

switch var a = calc(); a {
    .success => block {
        warn("{}", {a});
    },
    else => @panic(""),
}

// while
var i: usize = 0;
while {
    i += 1;
    if i < 10 { continue; }
    break;
}
assert(i == 10);

var i: usize = 1;
var j: usize = 1;
while i * j < 2000; i *= 2, j *= 3 {
    const my_ij = i * j;
    assert(my_ij < 2000);
}

const x = while var i = 0; i < end ; i += 1 {
    if i == number { break true; }
} else { false }

while var i = 0; getOptions(i) => v; i += 1 {
    sum += v;
} else {
    warn("", {});
}

// for
for seq => v {
    warn("{}", {v});
}

// if
if var a = 0; a != b {
    assert(true);
} else if a == 9 {
    unreachable;
} else {
    unreachable;
}

const a = if b { c } else { d };

const x: [_]u8 = if a => value {
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    break :blk {0, 1, 2, 3, 4};
}

// error
const number = parseU64(str, 10) catch { unreachable };

const number = parseU64(str, 10) catch { 13 };

const number = parseU64(str, 10) catch block blk {
    warn("", {});
    break :blk 13;
};

const number = parseU64(str, 10) catch err switch err {
    else => 13,
};

fn foo(str: []u8) !void {
    const number = parseU64(str, 10) catch err { return err };
}

if parseU64(str, 10) => number {
    doSomethingWithNumber(number);
} else err switch err {
    error.Overflow => block {
        // handle overflow...
    },
    error.InvalidChar => unreachable,
}

errdefer warn("got err", {});

errdefer err if errFormater => f {
    warn("got err: {}", {f(err)});
}

// tuple & anonymous literals
var array: [_:0]u8 = {11, 22, 33, 44};

const mat4x4: [_][_]f32 = {
    { 1.0, 0.0, 0.0, 0.0 },
    { 0.0, 1.0, 0.0, 1.0 },
    { 0.0, 0.0, 1.0, 0.0 },
    { 0.0, 0.0, 0.0, 1.0 },
};

var obj: Object = {
    .x = 13,
    .y = 67,
    .item = { 1001, 1002, 1003 },
    .baseProp = { .hp = 100, .mp = 0 },
};

mogud avatar Apr 15 '20 07:04 mogud

@SpexGuy Similar option: syntactically {} --> block/struct/tuple but void value -/-> block/struct/tuple? something else to consider...

fn baz() void {}  // ok

// expect return_type but found anonymous struct/list literal {}
fn baz() {} {}

// foo({}) --> bar is always a struct and never a void value?
// foo(@as(void, undefined)) to the rescue
fn foo(bar: var) var { ... }

// any other place to disallow empty blocks?
printf("{}", {})  // error: Too few arguments

@mogud I haven't read all those issues but the design looks quite messy...

while{}  // instead of while true{} or loop{} (yet another keyword!)

// how about "switch true {...}"?
// it duplicates the function of if/else
const b = switch {
    a > 0 => 1,
    else => 0,
}

// mind-blown by the use of ";"
switch var a = calc(); a {...}
if var a = 0; a != b {...}
while i * j < 2000; i *= 2, j *= 3 {...}
while var i = 0; i < end ; i += 1 {...}

// different use of "=>" from "switch"
if parseU64(str, 10) => number {...}
for seq => v {...}
while var i = 0; getOptions(i) => v; i += 1 {
    sum += v
} else {
    // why no semicolon?
    warn("", {})
}

const x: [_]u8 = if a => value {
    // disallow multiple statements?
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    // can it be just {0,1,2,3,4} (without semicolon)?
    break :blk {0, 1, 2, 3, 4};
}

const x = while var i = 0; i < end ; i += 1 {
    if i == number { break true; }
} else false;  // why no brackets?

jakwings avatar Apr 15 '20 12:04 jakwings

@iology sorry for those typos, I've edited the post.

const x: [_]u8 = if a => value {
    // disallow multiple statements?
    //      -> no, this is, and can only be, a single expression
    {value, 1, 2, 3, 4}
} else block blk {
    warn("default init", {});
    // can it be just {0,1,2,3,4} (without semicolon)?
    //      -> no, only catch/if/else used as expression can have a single expression within `{}`.
    break :blk {0, 1, 2, 3, 4};
}

mogud avatar Apr 15 '20 13:04 mogud

@mogud I truly do appreciate what you're doing here, but I don't think the syntax is going to go that direction.

andrewrk avatar Apr 15 '20 17:04 andrewrk

That's ok, zig itself is more important for me. :)

mogud avatar Apr 15 '20 17:04 mogud

As a newcomer to Zig I would also encourage not dropping this issue (syntax of tuples - or anonymous structs?). I have spent many hours being confused by the current syntax (it seems kind of unique to zig, and not in a good way) and trying to find out how to loop over const x = .{1, "string", void} or access specific elements (keep wanting to write x[1]).
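
For reference, a minimal sketch of indexing and looping over such a tuple, assuming a recent Zig release. The helper `countFields` is a hypothetical name, and the third element is `true` rather than `void` to keep the example simple.

```zig
const std = @import("std");

/// Hypothetical helper: count the elements of any tuple by unrolling a loop over it.
fn countFields(tuple: anytype) usize {
    var count: usize = 0;
    // `inline for` is needed because the elements have different types.
    inline for (tuple) |item| {
        _ = item;
        count += 1;
    }
    return count;
}

test "index and iterate a tuple" {
    const x = .{ 1, "string", true };
    try std.testing.expect(x[0] == 1); // the index must be comptime-known
    try std.testing.expect(std.mem.eql(u8, x[1], "string"));
    try std.testing.expect(countFields(x) == 3);
}
```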

mlawren avatar Mar 11 '21 10:03 mlawren

Let us re-evaluate this in light of #14523.

andrewrk avatar Feb 03 '23 20:02 andrewrk

My opinion on this is that the status quo syntax should remain. While I'm very glad that ZON as a concept has made it in, I strongly believe the Zig syntax should be tailored first and foremost to the source language, provided it doesn't cause significant usability issues in ZON. In this case, I believe that holds; the leading . doesn't actually make ZON any harder to write or read (the single . keystroke is very insignificant, and in terms of reading at least I visually parse .{ as one thing, so it doesn't hinder me at all). In terms of the language, I find the arguments given previously apply.

Blocks and struct literals are fundamentally quite different, and it doesn't make sense to me to unify them. For instance, in self-hosted, we specifically removed support for .{} to initialize a void value (which was a bug in stage1), which makes sense because they represent quite different things! .{} isn't a "catch all initializer", it's specific to aggregates, so it doesn't really make sense for it to also initialize void values. Going the other way, if we tried to simply get rid of the . from the current syntax, empty blocks and empty initializers both become {}, which in my eyes is even worse because this looks (at least to me) more like a block than an initializer (despite the latter likely being a far more common use).

Another problematic case is singleton tuples. These are used quite frequently, for instance in many uses of std.fmt, but just removing the leading . would make them look very similar to single-statement blocks ({func()} vs {func();}). The difference between these expressions is quite subtle, which in my eyes is a fairly confusing property; you don't know if what you're reading is a block or a tuple until you reach the end of the first expression/statement. It's true that the usage is normally obvious from context, but that only makes it more confusing in any rare cases where it's not.

The only way I would be in support of a syntax change here would be if instead of simply dropping the leading ., we replaced the initializer syntax with something else entirely, such as [ .x = 3, .y = 4 ] (therefore making [] an empty initializer). I'm not sure if there are any problematic cases in the grammar, and I wouldn't explicitly support such a change (I happen to quite like the current .{ } syntax, since it makes sense for me to replace the type name with a single token to represent "infer this"), but that solution feels to me much more favorable than outright removing information that helps both the parser and humans reading code to quickly understand what they're looking at.

mlugg avatar Feb 03 '23 22:02 mlugg

I fully agree with @mlugg. In the past few weeks, I have been "teaching" Zig to co-workers, who all found the .{ syntax very weird at first, but once they understood that it means current context.{ it "clicked" and it seemed a well-thought-out concept. It is in line with switch(foo) { .bar kind of statements, and the idea that "the dot with nothing before it" is the "current thing of whatever makes sense" is quite natural.

I mention that because we often have the comment that this syntax is weird and could make the language adoption harder, but it is not weird, it is uncommon, but I really think it is the right syntax.

Also as it was pointed out, there are edge cases like tuples which could turn this into something complicated to implement.

Finally, I'll add that I write quite a lot of Elixir, where the syntax for maps (which are used for structs) is %{...}, and this allows for syntax highlighting and ligatures, which can both help differentiate them from tuples ({} is a tuple in Elixir). .{ could someday make it into some ligature nerd font, and it can be highlighted in another color today! (for those who like colors)

kuon avatar Feb 04 '23 01:02 kuon

It seems that if you have experience in zig already then everything looks fine for you, but as a newcomer I need to say these random thoughts:

  1. Syntax .{} is ugly aesthetically, not that much as <> for generics tho :)

  2. Dots for struct-fields annoy, too much noise .{ .name = "Sam", .age = 11 }

  3. In JS there is a problem returning object from arrow functions and it's solved by wrapping in ( ... ):

     somefn(() => { name, age }); // doesn't work
     somefn(() => ({ name, age })); // works
    
  4. In JS there is a shorthand for objects. I remember how weird it looked to me when it was introduced, but after all it's really handy.

     const name = "Sam"; 
     const age = 11; 
     const obj = { name: name, age: age }; 
     const obj2 = { name, age };
    
  5. [ ... ] literals for array-like structures is also cuter than { ... }

Phew, I'm sure this was already discussed a million times but I just needed to say all this somewhere 🤣

deflock avatar Feb 04 '23 02:02 deflock

My humble opinion is that the outermost dot and brackets .{} are fine, but I agree with @deflock's second point. The dot before the outermost bracket should be enough to distinguish an anonymous struct from a block, and the dots before struct fields are redundant.

shanoaice avatar May 04 '23 07:05 shanoaice

Agree with @shanoaice.

Although this might be a little off topic, I wonder if we could get rid of the . in dereferencing and optional unwrapping as well. For dereferencing (.*), it may lead to confusion (with ** maybe? but that could be made clear with spaces).

So1aric avatar Jun 20 '23 02:06 So1aric

I heavily dislike that syntax idea; I happen to find the analogy with field access quite nice, especially for optionals where the payload more-or-less is a field. But more to the point, even without the issue of **, that's ambiguous; is x*-3 multiplying x by the integer literal -3 (x * -3), or is it dereferencing x and subtracting 3 (x* - 3)? Spacing can't clear this one up - both operators are single characters.
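
The two readings can be spelled out in status-quo syntax, where the `.` keeps them apart. This is a small sketch; the function names are hypothetical and chosen only for illustration.

```zig
const std = @import("std");

/// Status-quo spelling of "dereference x, then subtract 3".
fn derefThenSubtract(x: *const i32) i32 {
    return x.* - 3;
}

/// Status-quo spelling of "multiply x by the literal -3".
fn multiplyByNegativeThree(x: i32) i32 {
    return x * -3;
}

test "the two readings give different results" {
    const value: i32 = 10;
    try std.testing.expect(derefThenSubtract(&value) == 7);
    try std.testing.expect(multiplyByNegativeThree(value) == -30);
}
```

Without the dot, both functions would contain the same token sequence `x`, `*`, `-`, `3`, so whitespace alone could not tell them apart.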

mlugg avatar Jun 20 '23 02:06 mlugg

> A really cool but probably bad option is to define the single value of void as the empty struct. This would remove the ambiguity since it doesn't matter if {} is a block or an empty struct, they both evaluate to the value of void! But this could cause a lot of other weirdness, like var x: Struct = voidFn(); causing x to be default-initialized, so it's probably not something we should do.

I actually like that idea. What if we restrict it to coercion of {} literals to the type void? In other words, {} can coerce to

[_]T
// or
void

However, without any type coercion, the type of {} will default to void. Thus,

stdout.writer().print(
    "No ambiguity?\n",
    {} // coerced to array
) catch {}; // void

fn func(array: [0]u8, x: void) void {
    // …
}
const a: [_]u8 = {}; // coerced to array
const b: void = {}; // void
const c = {}; // void

// fn(array, void)
func(a, b);
func({}, b);
func(a, {});
func({}, {});


switch (a.len) {
    0 => {}, // void
    else => unreachable,
}

RaphiSpoerri avatar Jul 20 '23 01:07 RaphiSpoerri

Alternatively, the big cases where {} should definitely be parsed as a block are:

  • After control flow: if, else, for, etc.
  • After => in switch statements
  • After labels

I think that covers about 99.9% of the cases. We could make {} a block in these cases, and an empty structure/union/array otherwise.

RaphiSpoerri avatar Jul 20 '23 02:07 RaphiSpoerri

The current syntax is a lot clearer to read.

iacore avatar Sep 10 '23 23:09 iacore

If we represent tuples as having at least one comma {,} (empty tuple, without the . prefix), and {} as an empty block -> void, then that solves both of @SpexGuy's initial edge cases.

That representation also aligns with how Python defines its tuples; (1): int while (1,): tuple (for a tuple to be recognized as such, it must contain at least one comma; a collection of comma separated values).

Currently, .{1,} and .{1} are equivalent. The above representation is also consistent and aligns well with letting Zig infer the size of arrays: [_]u8{1, 2, ..., n}, where the literal is not prefixed with a `.`.

And @deflock's second point is quite valid.
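
The status-quo equivalence of .{1,} and .{1} mentioned above can be checked with a short test. This is a sketch assuming a recent Zig release; `tupleLen` is a hypothetical helper name.

```zig
const std = @import("std");

/// Hypothetical helper: number of elements in a tuple value.
fn tupleLen(tuple: anytype) usize {
    return tuple.len;
}

test "a trailing comma does not change a tuple literal" {
    const with_comma = .{1,};
    const without_comma = .{1};
    try std.testing.expect(tupleLen(with_comma) == 1);
    try std.testing.expect(tupleLen(without_comma) == 1);
    try std.testing.expect(with_comma[0] == without_comma[0]);
}
```

Under the comma-based proposal the two spellings would stop being interchangeable once the leading dot is dropped, since only the form containing a comma would be read as a tuple.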

OSuwaidi avatar Apr 07 '24 03:04 OSuwaidi

@OSuwaidi
{,} is ugly.

RaphiSpoerri avatar Apr 07 '24 16:04 RaphiSpoerri

@RaphiSpoerri Well, at least it clears up both ambiguities, is consistent, and people find .{} confusing/unconventional as well.

I'm sure Andrew initiated this issue in the first place for a reason.

OSuwaidi avatar Apr 07 '24 22:04 OSuwaidi