Proposal: Expression blocks

Open cston opened this issue 4 years ago • 270 comments

Proposal

Allow a block of statements with a trailing expression as an expression.

Syntax

expression
    : non_assignment_expression
    | assignment
    ;

non_assignment_expression
    : conditional_expression
    | lambda_expression
    | query_expression
    | block_expression
    ;

block_expression
    : '{' statement+ expression '}'
    ;

Examples:

x = { ; 1 };  // expression block
x = { {} 2 }; // expression block

y = new MyCollection[]
  {
      { F(), 3 }, // collection initializer
      { F(); 4 }, // expression block
  };

f = () => { F(); G(); }; // block body
f = () => { F(); G() };  // expression body

Execution

An expression block is executed by transferring control to the first statement. When and if control reaches the end of a statement, control is transferred to the next statement. When and if control reaches the end of the last statement, the trailing expression is evaluated and the result left on the evaluation stack.

The evaluation stack may not be empty at the beginning of the expression block so control cannot enter the block other than at the first statement. Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.
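As a rough illustration of the execution order described above, the block `{ F(); G() }` can be approximated in current C# with an immediately invoked lambda. This is only a sketch: `ExecutionSketch`, `Log`, `F` and `G` are hypothetical names, and the lambda is an emulation, not how the compiler would lower an expression block.

```csharp
using System;
using System.Collections.Generic;

static class ExecutionSketch
{
    // Records call order so the sequencing is observable (hypothetical helper).
    public static readonly List<string> Log = new List<string>();

    static void F() => Log.Add("F");
    static int G() { Log.Add("G"); return 42; }

    // Proposed form (not valid C# today):  x = { F(); G() };
    // Emulation: the statements run in order, then the trailing
    // expression is evaluated and becomes the value of the whole thing.
    public static int Evaluate()
    {
        Log.Clear();
        int x = ((Func<int>)(() => { F(); return G(); }))();
        return x;
    }

    static void Main()
    {
        Console.WriteLine(Evaluate());            // prints 42
        Console.WriteLine(string.Join(",", Log)); // prints F,G
    }
}
```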

Restrictions

return, yield break, yield return are not allowed in the expression block statements.

break and continue may be used only in nested loops or switch statements.

goto may be used to jump to other statements within the expression block but not to statements outside the block.

out variable declarations in the statements or expression are scoped to the expression block.

using expr; may be used in the statements. The implicit try / finally surrounds the remaining statements and the trailing expression so Dispose() is invoked after evaluating the trailing expression.

Expression trees cannot contain block expressions.
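The `using` restriction above says that Dispose() runs only after the trailing expression has been evaluated. A minimal sketch of that ordering in current C#, written with an explicit try / finally; `UsingSketch`, `Resource` and `Log` are hypothetical names invented for the illustration, and the proposed form in the comment assumes a using declaration inside the block:

```csharp
using System;
using System.Collections.Generic;

static class UsingSketch
{
    // Records the order of events (hypothetical helper).
    public static readonly List<string> Log = new List<string>();

    sealed class Resource : IDisposable
    {
        public int Value { get { Log.Add("read"); return 7; } }
        public void Dispose() => Log.Add("dispose");
    }

    // Proposed form (not valid C# today):
    //     int v = { using var r = new Resource(); r.Value };
    // Hand-written equivalent: the trailing expression sits inside the try,
    // so the resource is still alive while it is evaluated.
    public static int Evaluate()
    {
        Log.Clear();
        int result;
        var r = new Resource();
        try
        {
            result = r.Value; // trailing expression
        }
        finally
        {
            r.Dispose();      // runs after the trailing expression
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Evaluate());            // prints 7
        Console.WriteLine(string.Join(",", Log)); // prints read,dispose
    }
}
```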

See also

Proposal: Sequence Expressions #377 LDM 2020-01-22 https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions

cston avatar Jan 07 '20 19:01 cston

{ F(); 4 }, // expression block

In terms of impl, this will be a shockingly easy mistake to make (I do it all the time myself). We should definitely invest in catching this and giving a good message to let people know what the problem is and how to fix it, i.e. if we detect not enough expr args, going in and seeing if replacing a semicolon with a comma would fix things, and pointing people to that as the problem.

CyrusNajmabadi avatar Jan 07 '20 19:01 CyrusNajmabadi

Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.

Is this for ease of impl, or is there a really important reason this doesn't work at the language level? For example, I don't really see any issues with continuing (to a containing loop) midway through one of these block-exprs.

CyrusNajmabadi avatar Jan 07 '20 19:01 CyrusNajmabadi

I also don't see the reasons for any of the restrictions TBH, other than expression trees.

YairHalberstadt avatar Jan 07 '20 19:01 YairHalberstadt

Control cannot leave the block other than after the trailing expression unless an exception is thrown executing the statements or the expression.

Is this for ease of impl, or is there a really important reason this doesn't work at the language level.

The evaluation stack may not be empty at the continue.

int sum = 0;
foreach (int item in items)
{
    sum = sum + { if (item < 3) continue; item };
}
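For comparison, here is a sketch of the same loop in current C# with the conditional hoisted out of the expression, so nothing is pending on the evaluation stack when `continue` runs. It assumes that a `continue` inside the block would simply abandon the partially evaluated `sum + ...`; `ContinueSketch` and `SumFromThree` are hypothetical names.

```csharp
using System;

static class ContinueSketch
{
    // Proposed form (not valid C# today):
    //     sum = sum + { if (item < 3) continue; item };
    // Hoisted form: the test runs before 'sum' is loaded for the addition.
    public static int SumFromThree(int[] items)
    {
        int sum = 0;
        foreach (int item in items)
        {
            if (item < 3) continue;
            sum = sum + item;
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumFromThree(new[] { 1, 2, 3, 4 })); // prints 7
    }
}
```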

cston avatar Jan 07 '20 19:01 cston

The evaluation stack may not be empty at the continue.

Right... but why would I care (as a user)? From a semantics perspective, it just means: throw away everything done so far and go back to the for-loop.

I can get that this could be complex in terms of impl. If so, that's fine as a reason. But in terms of the language/semantics for the user, I don't really see an issue.

CyrusNajmabadi avatar Jan 07 '20 19:01 CyrusNajmabadi

@CyrusNajmabadi as a user I find the example by @cston hard to grok. Yanking the whole conditional statement out of the expression block makes everything MUCH clearer. Do you have a counterexample where return, break or continue work better inside an expression block?

orthoxerox avatar Jan 07 '20 20:01 orthoxerox

In terms of impl, we should look at the work done in TS here. In TS, { can start a block, or it can start an object-expr. Because of this, it's really easy to end up with bad parsing as users are in the middle of typing. It's important from an impl perspective to do the appropriate lookahead to understand if something should really be thought of as an expression versus a block.

CyrusNajmabadi avatar Jan 07 '20 20:01 CyrusNajmabadi

Consider the following:

{ a; b; } ;

A block which executes two statements inside, with an empty statement following.

{ a; b };

An expression-statement, whose expression is a block expression, with a statement, then the evaluation of 'b'.

Would we allow a block to be the expression of an expr-statement? Seems a bit wonky and unhelpful to me (since the value of the block expression would be thrown away).

Should we only allow block expressions in the case where the value will be used?

CyrusNajmabadi avatar Jan 07 '20 20:01 CyrusNajmabadi

@cston To avoid the look-ahead issue, I would suggest an alternative change:

block
  : '{' statement* expr '}'
  ;

This means that we always parse { ... as a block, even if it has a trailing expression. Then we can disallow it in the semantic layer where it isn't valid.

I think this would solve the look-ahead issue for the compiler, but not so much for humans. I'd still favor @{, ${ or ({ to indicate this is an expression-block.

jcouv avatar Jan 07 '20 20:01 jcouv

${

yes. I'm very on board with a different (lightweight) sigil to indicate clearly that we have an expr block

CyrusNajmabadi avatar Jan 07 '20 20:01 CyrusNajmabadi

How about ={ 😁

HaloFour avatar Jan 07 '20 21:01 HaloFour

I wonder if the ASP.NET team would lean their preference to @{ since that's already established for a statement block in Razor syntax. 🍝

Joe4evr avatar Jan 07 '20 21:01 Joe4evr

Isn't that a good reason not to use it, then, as it may cause parsing issues in a Razor/Blazor page?

spydacarnage avatar Jan 07 '20 21:01 spydacarnage

This is kinda neat but the syntax definitely bothers me as being too subtle of a difference for block vs expression. I think having $ as a prefix is more sensible and easier to recognize when reading.

mikernet avatar Jan 07 '20 22:01 mikernet

I'm not bothered by the semicolon, but understand the potential confusion.

Also, if I understand correctly, it will not be possible to simply relax the syntax and let the compiler decide whether the block is a statement or an expression, due to lambda type inference. Correct?

Trayani avatar Jan 07 '20 22:01 Trayani

I think this is really promising, and a good starting point.

We've been circling around the possibility of being able to add statements inside expressions for many years. I like the direction of this proposal, because:

  • the {...} is recognizable from statement blocks. I know that curly braces are already somewhat overloaded, and there will be ambiguous contexts, but from a cognitive perspective I think it doesn't make the situation significantly worse, and is preferable to adding some new syntax for statement grouping.
  • It provides natural and easy-to-understand scoping for any variables declared inside, including those declared in the trailing expression (e.g. through out variables).

Within that, I think there are several design discussions for us to have:

  • Should the result be produced by a single expression at the end (as proposed here), or via a result-producing statement (e.g. break expr; has been proposed in #3037 and #3038)? In the latter case it would be syntactically equivalent to a block statement, and just have different semantic rules (just as the difference between a block used for a void-returning vs a result-returning method). The former may work best for shorter blocks, the latter for bigger ones. Which should we favor?

  • Is the proposed precedence right? This disallows any operators from being applied directly to the statement expression. That's probably good, but needs deliberation. It limits the granularity at which an expression can easily be replaced with a block (though of course you can always parenthesize it, like every other low-precedence expression).

  • Should a block expression be allowed as a statement expression? probably not!

  • The proposal requires there to be at least one statement. That's kind of ok if the statement block is used for prepending statements to your expression! But once it's in the language I can imagine wanting to use it just to scope variables declared in a single contained expression.

  • I don't like the proposals for prepending a character so that you can "tell the difference", but that's another discussion to have. I don't think anyone other than the compiler team wants to "tell the difference". 😁

  • There's a potential "slippery slope" argument to allow other statement forms as expressions somehow. I don't think that's very convincing, since such statements should just be put inside of a block expression! But I can see that coming up.

  • We should make sure we gather the important scenarios. I've heard two really convincing ones:

    • as the branches of switch expressions (and switch statements if we do #3038). Switch expressions are themselves so complex that reorganizing the code to get a statement in becomes intrusive.
    • as "let-expressions" where a temporary local variable (or function) is created just for the benefit of one expression.

    #3037 has examples of the former. An example of the latter might be:

    var length = { var x = expr1; var y = expr2; Math.Sqrt(x*x + y*y) };
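For reference, the "let-expression" scenario can already be approximated in current C# with `is var` patterns, which introduce locals inside a single expression. This is a sketch of an existing workaround, not part of the proposal; `LetSketch` and `Length` are hypothetical names, and the parameters stand in for `expr1` and `expr2`.

```csharp
using System;

static class LetSketch
{
    // Proposed form (not valid C# today):
    //     var length = { var x = expr1; var y = expr2; Math.Sqrt(x*x + y*y) };
    // Workaround: 'is var' always matches and binds the value to a new local,
    // so the else branch (0.0) is never actually taken.
    public static double Length(double expr1, double expr2) =>
        expr1 is var x && expr2 is var y ? Math.Sqrt(x * x + y * y) : 0.0;

    static void Main()
    {
        Console.WriteLine(Length(3, 4)); // prints 5
    }
}
```

The workaround shows why a dedicated form is attractive: the conditional and its dead else branch are noise that exists only to smuggle declarations into an expression.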
    

At the end of the day, this is the kind of feature that, even when we've done the best we can on designing it, it just doesn't feel right and we end up not doing it. Putting statements inside expressions may just fundamentally be too clunky to be useful.

MadsTorgersen avatar Jan 08 '20 01:01 MadsTorgersen

Allow a block of statements with a trailing expression as an expression.

I'd love it if this were possible without requiring a modified syntax. Sure, I understand that this would change the meaning of existing code, but most of the time that change would be that a value is harmlessly discarded. I am aware of at least one situation where this could affect overload resolution for lambda expressions. Are there others?

HaloFour avatar Jan 08 '20 03:01 HaloFour

If I'm understanding the proposal correctly this would feel very weird when used with expression-bodied members.

class A 
{
    int Foo() => 5; //Expression

    int Foo2() => { ; 5 } //Expression block?

    int Foo3() => { return 5; } //Not allowed
}

MgSam avatar Jan 08 '20 04:01 MgSam

@MgSam that's what Mads is pointing out with "Should a block expression be allowed as a statement expression? probably not!"

333fred avatar Jan 08 '20 05:01 333fred

If statement expressions are added, and "Control cannot leave the block other than after the trailing expression", there's increased incentive to make conditionals more user friendly, so that it's easier for the result of a block expression to depend on a test.

I find deeply nested conditional expressions highly unreadable. This suggests that we should allow if-else expressions.

This also cuts the other way. With sequence expressions it's much easier to turn an if-else with multiple statements into an expression. All you have to do is remove the final semicolon.

In Scala and Rust it's common for the entirety of a method to consist of a single expression consisting of multiple nested if-else expressions. I find this to be a really nice style.

YairHalberstadt avatar Jan 08 '20 08:01 YairHalberstadt

If I understand correctly, the main motivation of this proposal is only switch statements (#3038).

I don't really see any other valuable benefit from this; something like a with operator would be much more desirable for me.

Consider a slightly changed version of @MadsTorgersen's example:

var length =
{
    var (x, y) = (GetX(), GetY());
    Math.Sqrt(x*x + y*y);
}

The following is much clearer and more obvious to me:

var (x, y) = (GetX(), GetY());
var length = Math.Sqrt(x*x + y*y);

or hide the variables in a function scope:

double CalculateDistance(double x, double y) => Math.Sqrt(x*x + y*y);
var length = CalculateDistance(GetX(), GetY());

So, from this point of view, an expression block looks to me like a local function body, without a signature or parameters, that is called immediately:

double CalculateDistance()
{
    var (x, y) = (GetX(), GetY());
    return Math.Sqrt(x*x + y*y);
}
var length = CalculateDistance();

var length =
{
    var (x, y) = (GetX(), GetY());
    return Math.Sqrt(x*x + y*y); // it should contain an explicit 'return'
}

But I am not sure that this is really an important and valuable feature...

0x000000EF avatar Jan 08 '20 08:01 0x000000EF

@0x000000EF

The expression block can take place in a deeply nested expression, where converting it to a set of statements would require significant refactoring.

YairHalberstadt avatar Jan 08 '20 08:01 YairHalberstadt

I think this would solve the look-ahead issue for the compiler, but not so much for humans. I'd still favor @{, ${ or ({ to indicate this is an expression-block.

I think { is good enough. We can always parenthesize it as ({ when needed.

If I understand correctly, the main motivation of this proposal is only switch statements (#3038).

I don't really see any other valuable benefit from this; something like a with operator would be much more desirable for me.

Ternary operator and object initialization will benefit from this too.

var grid = new Grid {
    Children = {
        ({
            var b = new Button { Text = "Click me" };
            Grid.SetRow(b, 1);
            b
        })
    }
};

ronnygunawan avatar Jan 08 '20 08:01 ronnygunawan

@YairHalberstadt, can you provide an example?

@ronnygunawan, this seems clearer to me:

Button CreateClickMeButton()
{
    var b = new Button { Text = "Click me" };
    Grid.SetRow(b, 1);
    return b;
}

var grid = new Grid {
    Children = {
        CreateClickMeButton()
    }
};

0x000000EF avatar Jan 08 '20 09:01 0x000000EF

@0x000000EF When building deeply nested UIs using code it is often desirable to have the elements declared right where they are in the tree, not split off somewhere else. It mirrors the equivalent XAML/HTML/etc more closely and it's easier to reason about the structure of the UI.

@MadsTorgersen

I don't think anyone other than the compiler team wants to "tell the difference"

I'm not sure what you mean by that. I think it's useful to be able to reason about the difference in behavior between...

f = () => { F(); G(); }; // block body
f = () => { F(); G() };  // expression body

...with something less subtle than just the absence of the semicolon, particularly if the proposal to implicitly type lambdas to Action/Func in the absence of other indicators gains traction. I guess the stylistic nature of the second example just feels a bit odd to me in the context of C# but maybe with time I'd get over that. A keyword before the trailing expression would solve that minor gripe as well but I'm not overly invested either way, just a suggestion to consider.

mikernet avatar Jan 08 '20 09:01 mikernet

@mikernet, it is not a big problem if we have something like a with operator:

static T With<T>(this T b, Action<T> with)
{
    with(b);
    return b;
}

var grid = new Grid {
    Children = {
        new Button { Text = "Click me" }.With(b => Grid.SetRow(b, 1))
    }
};
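A self-contained version of the same idea, for anyone who wants to try it without a UI framework. `Widget` is a hypothetical stand-in for `Button`, and setting `Row` stands in for `Grid.SetRow`; the extension method has to live in a static, non-nested class to compile.

```csharp
using System;

// Hypothetical stand-in for a UI element such as Button.
sealed class Widget
{
    public string Text { get; set; } = "";
    public int Row { get; set; }
}

static class WithExtensions
{
    // Runs 'configure' on the value, then returns the same value,
    // so configuration can be chained inside a larger expression.
    public static T With<T>(this T value, Action<T> configure)
    {
        configure(value);
        return value;
    }
}

static class WithDemo
{
    public static Widget Build() =>
        new Widget { Text = "Click me" }.With(w => w.Row = 1);

    static void Main()
    {
        var w = Build();
        Console.WriteLine(w.Text + " in row " + w.Row); // prints Click me in row 1
    }
}
```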

0x000000EF avatar Jan 08 '20 09:01 0x000000EF

I'm definitely not a fan of seeing something like ({ Foo(); 3;}). I really dislike the 3; part. I would much rather see something like:

@{
  Foo();
  return 3;
}

where the block expression basically looks and acts a lot more like a delegate/local function.

TonyValenti avatar Jan 08 '20 12:01 TonyValenti

For what it's worth, I'd prefer a variant of this over #377 (Sequence Expressions). Assuming its scoping behaves exactly as a normal block's.

Also one more vote for the (1) { *; break expr; } syntax instead of the (2) { *; expr } syntax. (2) looks somewhat more elegant, but it clashes with the style of the rest of the language. Does that make sense?

munael avatar Jan 08 '20 13:01 munael

@TonyValenti,

I really dislike the 3; part

Which goes to prove the "you can't please all of the people..." adage. That form, { Foo(); 3 } is exactly the selling point for me. Require a break or return or whatever and it's now a strange half statement block/ half inline function that can weirdly appear in an expression.

DavidArno avatar Jan 08 '20 13:01 DavidArno

Not sure how the scoping of variables declared within an expression block would work, but supporting the following scenario seems a key requirement of this feature as it's one of the primary drivers for extending expressions in this way:

C Foo()
{
    return new C { 
        Prop1 = { var x = ExpensiveMethod(); x.P1 }, 
        Prop2 = x.P2 
    };
}

so x needs to "leak" out of its expression block into the initialiser.

DavidArno avatar Jan 08 '20 13:01 DavidArno