Switch expressions with some cases that need a body

Open TimWhiting opened this issue 1 year ago • 55 comments

Switch expressions are really useful. However, in the past few days of working with them frequently, I've run into the problem that most of my case bodies are expressions, but a minority of them need a proper body. In such cases I have to either revert everything to a switch statement (it's good that there is an assist for that) or create an immediately invoked function. This is a poor user experience.

return switch (this) {
      Ref() => k(this),
      // ... etc
      Lambda(:final formals, :final body) => (){
         //... Some code
         return x;
      }(),
    };
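For reference, a minimal runnable sketch of the two workarounds available today. The Term hierarchy, its fields, and the function names are hypothetical stand-ins; only the names Ref and Lambda come from the snippet above.

```dart
// Hypothetical term hierarchy, for illustration only.
sealed class Term {}

class Ref extends Term {
  Ref(this.name);
  final String name;
}

class Lambda extends Term {
  Lambda(this.formals, this.body);
  final List<String> formals;
  final Term body;
}

// Workaround 1: an immediately invoked function inside the switch expression.
String describeWithIife(Term term) => switch (term) {
      Ref(:final name) => 'ref $name',
      Lambda(:final formals, :final body) => () {
          final params = formals.join(', ');
          final inner = describeWithIife(body);
          return 'lambda($params) -> $inner';
        }(),
    };

// Workaround 2: fall back to a switch statement for every case.
String describeWithStatement(Term term) {
  switch (term) {
    case Ref(:final name):
      return 'ref $name';
    case Lambda(:final formals, :final body):
      final params = formals.join(', ');
      final inner = describeWithStatement(body);
      return 'lambda($params) -> $inner';
  }
}
```

Both functions should produce the same string for the same term; the difference is only in how much of the switch has to change shape when one case needs statements.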

My proposal would be to allow a body instead of an expression in switch expressions. The body has the caveat that it cannot use break / continue, and it has to have a return statement. The semicolon of the return statement then separates the cases. So the previous example would change to this:

return switch (this) {
      Ref() => k(this),
      // ... etc
      Lambda(:final formals, :final body):
         //... Some code
         return x;
    };

May 11 '23 15:05 TimWhiting

The fact that we're using => makes me really wish that we could write:

final a = switch (obj) {
  pattern {
    return 42;
  } 
}

(with or without a ":" after the pattern)

I often switch back and forth between => and {return} depending on what's most readable for the given situation.

May 11 '23 15:05 rrousselGit

That would be most consistent with function body / function expression bodies, which is actually probably more consistent for the feature than using switch statement body syntax.

I don't know which would fit into the grammar better.

May 11 '23 15:05 TimWhiting

We have had some proposals about a construct which is often called 'block expressions'. Cf. https://github.com/dart-lang/language/issues/2848#issuecomment-1431991492.

They could do the job, and they amount to the same syntax except for the parentheses:

  return switch (this) {
    Ref() => k(this),
    // ... etc
    Lambda(:final formals, :final body) => {
        //... Some code
        return x;
      },
  };

The block expression is quite similar to the function literal application () { /*my code here*/ } (), but there is no function object at run time, which means that there is no run-time cost as there would be with the function literal.

May 11 '23 15:05 eernstg

Yes, I do think that this feature could be more general, as in your linked issue.

However, since that feature will most likely not be available for a while, I think it would be good to support a refactoring assist that converts an expression to an immediately invoked function expression.

May 11 '23 16:05 TimWhiting

I agree this is an annoying wart of the syntax.

If I had a time machine, Dart would have always been an expression-based language. But I don't, and it isn't, so I tried to fit patterns and switch expressions in as gracefully as I could given that Dart already does make a distinction between expressions and statements.

Something like block-expressions could help, though it makes me wonder if we should consider just trying to actually do the whole thing and make the language expression-oriented.

May 11 '23 23:05 munificent

If you look at the changes recently, they are heavily expression oriented:

Collection elements
Records
Patterns

The first two work in a statement-oriented language: collection elements are only useful in expression position, and records are to a large degree part of patterns, but as just a data structure they are mostly an anonymous constructor expression. Patterns, as you mentioned, are really hard to fit into a statement-oriented language. I think it is great that they are finally here. I wouldn't go back to pre-Dart 3.0, but the experience has yet to feel polished, in my opinion. This can, and I believe will, come over time; however, I wonder if some larger breaking syntax changes will be needed to make it actually feel polished.

Something like block-expressions could help, though it makes me wonder if we should consider just trying to actually do the whole thing and make the language expression-oriented.

I'm curious what you envision for this. Is a larger breaking change with an automated syntax migration needed, in your opinion, to make Dart an expression-first language?

May 12 '23 01:05 TimWhiting

They could do the job, and they amount to the same syntax except for the parentheses:
  return switch (this) {
    Ref() => k(this),
    // ... etc
    Lambda(:final formals, :final body) => {
        //... Some code
        return x;
      },
  };

My two cents on this is that it looks a bit weird.

=> {} is reminiscent of JS. It's not something we see in Dart.

It'd be quite confusing imo

May 12 '23 03:05 rrousselGit

This can follow the same form as any other Dart function syntax. The user should be able to write either of the following:

Lambda(:final formals, :final body) {
  //... Some code
  return x;
},

OR

Lambda(:final formals, :final body) => x,

May 14 '23 04:05 braj065

Lambda(:final formals, :final body) {
  //... Some code
  return x;
},

OR

Lambda(:final formals, :final body) => x,

Yes, I think that would work, though we have to be careful that it doesn't run into an ambiguity around map patterns or get in the way of other pattern syntax we might want to add.

The harder question is what the semantics of that block body are. I don't think return in the middle of a block should mean "yield this value from the block". Users expect return to return from functions and I think that's a good property to maintain.

I would be tempted to do something Rust-like and say that a block yields the value of the last statement in it (which would be void if the last statement wasn't an expression statement or some other statement that yields a value like a nested block).

But making the language expression oriented like this would be a big change, and I have no idea if there is a coherent design that would hold together and, if so, if we could actually ship it and migrate the world onto it without too much chaos.

May 18 '23 21:05 munificent

I humbly propose the do-give expression:

var x = do {
  // ...
  give y;
};

The give keyword would immediately break out of the innermost do expression. give would only be considered a keyword when syntactically within a do-expression (for backwards compatibility).

In the switch example:

  return switch (this) {
    Ref() => k(this),
    // ... etc
    Lambda(:final formals, :final body) => do {
      //... Some code
      give x;
    },
  };

Mar 05 '24 21:03 dgreensp

I humbly propose the do-give expression:
var x = do {
  // ...
  give y;
};
The give keyword would immediately break out of the innermost do expression. give would only be considered a keyword when syntactically within a do-expression (for backwards compatibility).

In the switch example:
  return switch (this) {
    Ref() => k(this),
    // ... etc
    Lambda(:final formals, :final body) => do {
      //... Some code
      give x;
    },
  };

To reuse syntax, we could use yield instead.

Mar 05 '24 22:03 mateusfccp

To reuse syntax, we could use yield instead.

I guess the first question is, can you return or yield from a do-expression? It certainly could be allowed; the semantics are easy to imagine. Some languages (like Kotlin) even make return itself an expression, making it normal to return from the middle of an expression.

If return and yield are not allowed (with their usual semantics) inside a do-expression, the question becomes whether it is confusing to see one of those keywords used for exiting a do-expression. It was already commented that using return is not desirable, and the same objection could apply to yield.

Mar 06 '24 00:03 dgreensp

I would say we shouldn't allow yield inside of a do block. IMO do blocks should be only to "build" a complex expression, and shouldn't have side-effects. Although we can't control, for instance, if a function called inside a do block causes side-effects (I'm actually fond of languages that are explicit about side-effects, but this would be a major change in Dart's paradigm), we could have some controls regarding it, for instance:

disallow yield and return (disregarding the question about whether it would be confusing to repurpose the keywords)
disallow calling void functions
disallow discarding values (i.e. if one calls a function, the returned value must be used)

I can see how these restrictions may cause other problems. For instance, if one wants to build a list inside the do block, they are forced to use a collection-for instead of a for+list.add.
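To make the trade-off concrete, here is a small sketch in current Dart of the two styles being contrasted. The function names are made up for illustration.

```dart
// Statement style: a loop that mutates a list through the void method add.
// Under the restrictions listed above, this would not be allowed in a do block.
List<int> squaresWithLoop(List<int> xs) {
  final result = <int>[];
  for (final x in xs) {
    result.add(x * x);
  }
  return result;
}

// Expression style: a collection-for builds the same list as one expression,
// without calling a void function or discarding a value.
List<int> squaresWithCollectionFor(List<int> xs) =>
    [for (final x in xs) x * x];
```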

Between return and yield, I think the latter is semantically more appropriate and less confusing, but one can indeed argue that it may be confusing inside sync* functions.

Mar 06 '24 12:03 mateusfccp

pattern => do { ... } feels horrifying to me.

Mar 06 '24 13:03 rrousselGit

If we add a way to have statements inside expressions, we can indeed choose to either allow or disallow local control flow statements like return or break.

I'd personally allow them. It can make some things harder to reason about, but writing opaque code isn't hard today, you're just expected not to do it. But sometimes a control flow in the middle of an expression is precisely what you need to express the logic concisely. I can definitely see something like ... foo(expression ?? return null) ... being useful, with the alternative being

var value = expression;
if (value == null) return null;
... foo(value) ...
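As a point of comparison in current Dart: throw is already an expression, so a bailout can sit on the right of ??, while return still needs the statement form. A minimal sketch with hypothetical function names:

```dart
// Today: returning early requires a separate statement before the use.
int? doubledLength(String? text) {
  final value = text;
  if (value == null) return null;
  return value.length * 2;
}

// `throw` is an expression, so this bailout fits inside the expression itself.
int requiredLength(String? text) =>
    (text ?? (throw ArgumentError('text must not be null'))).length;
```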

If we disallow control flow operators in statement expressions, we can consider also disallowing them in finally blocks. That's another place where doing explicit control flow can get hard to reason about.

We should not allow statements only in switch expressions. Then people will write switch (0) {_ => stmts} to get statements inside any expression. Might as well make it general (maybe with slightly shorter syntax for switch expression branches).

If we can agree on behavior, I think it will be possible to find a syntax.

We can easily allow return/break/continue/rethrow as expressions, even if we don't allow more statements.

More expressions could be allowed using a block, do {…} being the most common proposal, probably because do is short (and other languages use it). Then it can either require you to exit at the end, with a trailing expression (do {stmt* expr}), or have an explicit exit operator. I'm partial to => expr;, because then a switch expression starts looking like a statement with exit operators as case bodies.

But I also want to be able to nest such statement expressions, and not necessarily exit the innermost one. We already have a syntax for naming a piece of code and exiting it early: labels and break. I've suggested break with values, to exit an expression, or labeled expression.

… (label: foo(do{stmt;if (b) break label: 42;…})

Again, if we can agree to have the functionality, we can start to look for a good syntax.

Mar 07 '24 06:03 lrhn

… (label: foo(do{stmt;if (b) break label: 42;…})

This syntax is so verbose it defeats the very purpose of the exercise IMO, which is to allow writing short fragments of code that otherwise don't fit into a single expression (under the current definition of expression). "Short" is key here: for longer programs, the current syntax with function literals (){...}() is OK. Here's one of the ideas: support composite expressions of the form (statement; statement; ... statement) (note the parentheses, as opposed to braces). The expression returns the last computed value. The syntax can be very restrictive: only declarations (var, final), ifs and expressions. For "if", allow only expressions in the body (no blocks). No loops, breaks, returns or anything except (composite) expressions.

The above example … (label: foo(do{stmt;if (b) break label: 42;…}) can now be written as … (stmt;if (b) 42 else 43)

Mar 07 '24 15:03 tatumizer

I think that allowing small and restricted statements, but not the full power of statements, is always going to be just short of powerful enough for what people want. Might as well make it full power.

If you just need to do one computation before another, you can use:

T seq<T>(void _, T value) => value;

and write seq(updateBananas(), bananas.average). No new syntax needed for that. As for allowing if "statements" that can only have a single expression in each branch, that's what the ?/: conditional expression already does. (Not that I don't want to use if instead of that, #3374).
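A small runnable sketch of the seq helper in use. The eventLog list and recordAndCount function are hypothetical; the sketch relies on Dart evaluating arguments left to right, so the side effect runs before the value is read.

```dart
// Evaluates its first argument only for its side effect, then yields the second.
T seq<T>(void _, T value) => value;

// Hypothetical example state: a log that is updated before its length is read.
final List<String> eventLog = [];

// The add call runs first, then eventLog.length is evaluated and returned.
int recordAndCount(String entry) => seq(eventLog.add(entry), eventLog.length);
```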

If it's to introduce a local variable, like var firstLast = (var x = computeList(); (x.first, x.last));, then I'd prefer a proper let (#1052) or even better, a declaration as an expression (https://github.com/dart-lang/language/issues/1420).
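Until something like let exists, one way to approximate a local binding inside an expression in Dart 3 is a one-case switch expression with a variable pattern. This is only a sketch of that idiom; computeList here is a hypothetical stand-in.

```dart
// Hypothetical data source for the example.
List<int> computeList() => [3, 1, 4, 1, 5];

// The variable pattern `var x` always matches, so the switch acts as a
// binding of `x` that is scoped to a single expression.
(int, int) firstAndLast() => switch (computeList()) {
      var x => (x.first, x.last),
    };
```

It is noisier than a dedicated let form would be, which is arguably part of the motivation for the linked proposals.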

If it's to allow local control flow using break or return, we can allow those as expressions independently of this feature. (I think we should.)

So, if we want to allow statements inside expressions, we should allow all statements.

All we need to do is agree on semantics, and then syntax. :) For semantics, any construction that allows general statements can allow return or break. We can disallow those specifically, or we can accept that an expression can now complete with a value, by throwing, or by breaking, continuing or returning. And an expression should always either evaluate to a value, or complete in one of the other ways. The one thing it cannot do, which a statement can, is to "complete normally" with no value.

For syntax, the simplest is the do { statements; expression } approach. Allowing an early escape can be nice, so do {statements;} where <emit> expression; is now a statement (or expression!) which exits the nearest enclosing do-expression with that value, can work. It introduces a new way to complete a statement: "complete with value". It's then a compile-time error if statements; can complete normally, without a value, according to flow analysis. The <emit> syntax can be any of ^ expression (Smalltalk style "return"), => expression (Dart style?), do = expression (Pascal-style return-value setting, but doesn't break?), break = expression or break: expression, or whatever we can come up with.

Being short is nice. Being able to refer to a further-out labeled do-expression is probably needed. Maybe ^label: expression, break label: expression, or break label = expression. Which means being able to label the blocks. label: do { ... } is tricky in element context, it might have to be do label:{ .... }. Labels will be rare, so not a big issue, as long as they are there when you actually need them. (My suggestion of (label: expression) for labeled expressions also doesn't work any more, records used that syntax.)

We could allow (statement ... statement expression) as an expression. It would effectively be a block statement. It saves the do, and allows a final expression to be emitted directly. But how do we nest composite statements? Will it look odd to do: var norm = (var r = 0; for (var i in xs) { r += i * i; } sqrt(r));, probably formatted like:

 var norm = (
   var r = 0; 
   for (var i in xs) { 
     r += i * i; 
   } 
   sqrt(r));

Using ( on the outside, but { inside is a little weird. The prefix do works by allowing a {-} body after it.

Lots of options. If we can agree on the semantics, we should be able to find a useful syntax to match it. If we don't agree on semantics, discussing syntax is moot. (And I don't agree that something like this should be limited to switches, so I'll defer to a more general "statements in expressions" feature.)

Mar 08 '24 10:03 lrhn

Lots of options. If we can agree on the semantics, we should be able to find a useful syntax to match it. If we don't agree on semantics, discussing syntax is moot.

In this case, I think I disagree on the semantics proposed in this issue in favor of something like #1052.

If I understood correctly, with this we could do (adapting from original example):

return switch (this) {
  Ref() => k(this),
  // ... etc
  Lambda(:final formals, :final body) =>
    let a = _buildAFromFormals(formals) in
    let b = _doSomethingWithAToGetB(a) in
    b.processBody(body);
};
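For comparison, the chained bindings can be approximated today with a Kotlin-style extension method. Everything below is a hypothetical sketch: the let extension is not part of the Dart SDK, and the two helpers are stand-ins for the ones named in the example. Unlike a built-in let, each call here creates a closure.

```dart
// Hypothetical helper, not part of the Dart SDK: passes the receiver to `body`.
extension Let<T> on T {
  R let<R>(R Function(T it) body) => body(this);
}

// Hypothetical stand-ins for the helpers named in the example above.
int buildAFromFormals(List<String> formals) => formals.length;
String doSomethingWithAToGetB(int a) => 'arity $a';

// Each nested let binds one intermediate value for the expression inside it.
String describeLambda(List<String> formals, String body) =>
    buildAFromFormals(formals).let((a) =>
        doSomethingWithAToGetB(a).let((b) => '$b: $body'));
```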

Mar 08 '24 10:03 mateusfccp

A radical solution would be: just saying that the "block statement" (as defined in 18.1 of the spec) implicitly returns a value (thus effectively becoming a "block expression") and this "returned value" is determined exactly as in other languages supporting block expressions (there's a lot of prior art). (Essentially, it always returns the "last computed value").

var five= {
    fn_call();
    5;
}

What is the downside? I can't think of any. Right now, the construct

{
    fn_call();
    5;
}

is already legal in Dart anyway; it's just very rarely used by anyone except a couple of people from the Dart team. What harm will be caused by saying that this construct implicitly returns 5?

If this is OK, then the example from the original post can be written as

return switch (this) {
      Ref() => k(this),
      // ... etc
      Lambda(:final formals, :final body) => {
         //... Some code
         x; // assuming x is defined by "some code"
      },
    };

Similarly, there's no harm in saying that "if-else if ... else" implicitly returns the last computed value (some restrictions apply - e.g. the case with no "else" won't implicitly return anything):

var x= if (a<b) 3; else 4; // works
var y= if (a <b) 3; // error
var z= if (a<b) { // works, rules for block expression apply here
  stmt; 
  5;
} else {
  stmt;
  3;
}

There are only two constructs that support "implicitly returned value": block expression and if-expression (plus already defined switch-expression). There's no "while"-expression or "for"-expression or anything else.

There's a small catch though: "early return" from block expression is problematic. "break value" is no good (conflicts with break label or leads to ugly syntax), and "return" means something totally different. Not a big deal IMO. (Maybe "emit" can help, but it also might become the source of confusion).

Mar 08 '24 21:03 tatumizer

If we say that a {statements;} has a type and a value, then we must be able to infer the type. Which means passing in context types. And probably assigning a type to a statement with no actual value (void, value null).

The type of a statement with type context C is:

expression-statement: type of expression with type context C.
labeled statement: type of inner statement with type context C.
if-statement:
- If has else, UP of types of each branch with context C.
- If has no else, S? where S is type of then branch with context C.
switch-statement: UP of types of every case block.
block statement: {stmt1 ... stmtN}.
- If N is zero (empty block), then type void (and value null)
- If not labeled (not a break target): type of stmtN with context C.
- If labeled and used as break target: void. (Let's not try to allow 42; break block; to "return" a value.)
do-loop: type of body with context C.
any other (for,while,try): void. (Which means not allowed in most expression contexts.)

That can work. I am worried about parsing it, for all the same reasons we don't allow an expression statement to start with a { character. With collection literals, it's pretty certain that this would be ambiguous. Take:

var x = {if (test) {} else {}};

Is that a statement block or a set-of-maps literal?
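For what it's worth, here is a sketch of how that line is read under current Dart rules, as far as I understand them: the outer braces form a set literal with an if element, and each inner {} is an empty map literal. The function name is made up for the example.

```dart
// Under current rules this builds a set containing one empty map.
Set<Object?> parseAsToday(bool test) {
  var x = {if (test) {} else {}};
  return x;
}
```

If block expressions reused bare braces, the same characters could also be read as a block containing an if statement with two empty branches, which is the ambiguity in question.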

(The do is there for a reason, to explicitly opt in to being a set of statements.)

Mar 09 '24 11:03 lrhn

Parsing can always be solved by adding an extra symbol on the opening bracket. This would help with readability too I think.

var x = #{
  return 42;
};

And if we opt for this, I'd heavily suggest making the => on switch expressions optional in this case.

Rather than:

final a = switch (v) {
  _ => #{ return 42; }
}

We'd have one of:

final a = switch (v) {
  // Desugared to `=> #{...}`?
  _ #{ return 42; }
}

final a = switch (v) {
  // Reminiscent of switch statement, for syntax parity.
  // Then I'd suggest supporting `=>` in switch statements to be complete.
  case _: #{ return 42; }
}

Mar 09 '24 11:03 rrousselGit

var x = {if (test) {} else {}};

I think it's a contrived example. If someone wanted just an if-expression, the code would be different: var x = if (test) {} else {}. But sure, it's a matter of principle. For 1-tuple, the user has to write an extra comma: x=(1,);. A similar rule can be introduced for the 1-element set literals. Set literals are rare, single element set literals are still rarer, so the new requirement won't affect too many users.

As for "do" - for some reason, I don't like it, it just feels wrong. Especially considering that we already have "do - while" in the language, which doesn't return a value and never will. Maybe #{...} is indeed a good compromise?

@rrousselGit : there should be no "return" in your examples. Block "returns a value" not via "return", but implicitly. Unfortunately, this can create confusion.

final a = switch (v) {
  case _: #{ stmt; return 42; } // returns from the containing function!
}

would mean a totally different thing than

final a = switch (v) {
  case _: #{ stmt; 42; }
}

That's why I suggested disallowing "return" in block expressions.

@lrhn: this is problematic

If has no else, S? where S is the type of then branch with context C.

This clashes with the existing treatment of if-else in collection literals. Not sure what can be done about it.

Mar 09 '24 12:03 tatumizer

Definitely a contrived example. I expect almost all possible reasonable programs to be uniquely parsable, and the test will be contrived because it's obvious to a human reader what the author means. But we still need a computer to parse it, and do something, preferably the right thing, with every acceptable input. Parsing isn't impossible (probably), but it might get harder, which usually also means that all later changes get harder too.

Then there is the issue of readability. If the meaning of a long piece of code changes significantly depending on whether there is a semicolon or not, then it breaks with the design principle of "make similar operations have similar syntax, and different operations look different". The first part is to make learning the language easier. The second is to make code readable. If {if(test) {}} can mean two vastly different things, a set literal or a statement block, then you need context to know which one it is. So far we've taught users that if it starts a statement, it's a statement. Making it also, possibly, be a statement in the middle of an expression throws out that learning.

And statements in expression position will definitely clash with elements. We have three different kinds of code:

expressions, always have a value
elements, zero or more values
statements, no values.

(If we had conditional arguments, we'd cover the zero-or-one case too.)

Those are significantly different, and we can't easily use one instead of the other. At least we always need to know which one it is.
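A short sketch of the three kinds of code in current Dart, with made-up function names, to show why they are not interchangeable:

```dart
// Expression: always produces exactly one value.
int larger(int a, int b) => a > b ? a : b;

// Elements: each one contributes zero or more values to the collection.
List<int> positives(List<int> xs, {bool includeZero = false}) => [
      if (includeZero) 0, // contributes zero or one value
      for (final x in xs)
        if (x > 0) x, // contributes zero or more values overall
    ];

// Statement: executed for its effect; the `if` itself has no value.
void printIfNegative(int x) {
  if (x < 0) {
    print('negative: $x');
  }
}
```

The same if keyword appears in the element and the statement, but only the position tells the reader which of the two it is.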

Mar 09 '24 14:03 lrhn

@rrousselGit : there should be no "return" in your examples. Block "returns a value" not via "return", but implicitly. Unfortunately, this can create confusion.

Not necessarily. That's up to us to decide what the "return" in a custom block does. Especially when using a special syntax for defining the block.

#{...} could very well be sugar for () {...}(), in which case return definitely does not quit the enclosing function.

Mar 09 '24 14:03 rrousselGit

Indeed, #{...} could very well be sugar for () {...}(), thus requiring "return", but if we intend to address if-expressions in a similar manner, then we have a problem. Is this valid code?

var x = if (cond) {
   stmt;
   42;
} else {
  stmt;
  43;
}

Writing "return 42;" is not an option (clearly, it would be a return from the enclosing function). So, we have to introduce "implicit return" for if-expression. But as soon as we introduce implicit returns for if-expression, why do it ONLY for if-expression and not for block-expression? Another example:

var x = #{
  stmt;
 
  if (test) {  
    42; 
  } else {
    43;
  }
}

This won't work: you need to write "return if (test)..." or "return 42"/"return 43". I find it inconsistent and also inconvenient. We may, of course, live without if-expressions; I just hoped we could kill two birds with one stone. But without the second bird, the case for #{} becomes very weak: just use (){...}() instead.

@lrhn: does syntax #{...} address your concerns? There's no conflict with the set literal. (Personally, I would prefer to go without '#' marker - it just looks cleaner. Also: why does block-expression require a marker, and if-expression doesn't? )

Mar 09 '24 16:03 tatumizer

Writing "return 42;" is not an option (clearly, it would be a return from the enclosing function).

The point is that we'd use #{ in all block expressions. So:

var x = if (cond) #{
   stmt;
   return 42;
} else #{
  stmt;
  return 43;
}

There's no issue here.

Or:

var x = #{
  stmt;
 
  if (test) {  
    // We quit the enclosing #{ block here
    return 42; 
  } else {
    return 43;
  }
}

And of course, we could use this in collections:

final x = [
  if (foo) #{
    statement;
    return 42;
 }
]

Mar 09 '24 16:03 rrousselGit

If we want to call a thing "an expression", then the term itself imposes some restrictions. In particular, you can't use the word "return" in the context of the expression with the same meaning as you do inside a function. Expressions like this are supported in many languages, but there, if you say "return x" inside a block, it will return from the enclosing function (not from the block). See, for example, this thread. In Rust, they use "break" (not "return") for the "early return", but they have a special syntax for labels, so "break 42" is not ambiguous. Again, this is not only Rust; it looks like an established convention.

Mar 09 '24 18:03 tatumizer

Block/statement expressions "need" a delimiter mainly because statements can already contain expressions. If expressions can just contain statements, then we risk cycles in the grammar. A unique delimiter can make it clear and unambiguous which way we are parsing the following syntax, both to the compiler and to the human reader.

The alternative, which we have used so far, is that if something starts a statement, then it's the statement form, otherwise it's the expression form. That has worked fairly well. Then we added elements, which can also contain expressions, and again the rule is that if it can start an element, it does. Expressions are the lowest "precedence" of interpreting ambiguities. If an expression can now start with a statement, things may get weird, because now an element can start with a statement, which is new and grammatically confusing.

Delimiters makes the grammar unambiguous and gives the reader a hint about what's to come.

"If expressions", which I guess means conditional expressions like e1 ? e2 : e3, have a distinctive syntax. It's not delimiters, and that has caused us a lot of trouble over the years, as we try to use ? and : for other things too. If a conditional expression had to be parenthesized, (e1 ? e2 : e3), then it would be much, much easier to avoid ambiguity. (And you generally should parenthesize your conditional expression if it's not an expression by itself. Don't do this: isDiskOk ? discController() : discRepairController()..format().)
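The cascade pitfall mentioned above can be reproduced with plain lists. In this sketch (function names are hypothetical), the cascade binds more loosely than the conditional, so without parentheses it applies to whichever branch is selected.

```dart
// Parsed as (useFirst ? first : second)..add(99): the cascade applies to
// whichever list the conditional selects, not only to `second`.
List<int> pickUnparenthesized(
        bool useFirst, List<int> first, List<int> second) =>
    useFirst ? first : second..add(99);

// Parentheses restrict the cascade to the else branch.
List<int> pickParenthesized(
        bool useFirst, List<int> first, List<int> second) =>
    useFirst ? first : (second..add(99));
```

Calling the first function with useFirst set to true adds 99 to first, which is probably not what someone writing it that way intended.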

(You can totally use the word return in an expression with the same meaning as in a statement: Returning from the surrounding function. It's giving it a different meaning, like "returning from the surrounding expression" that gets weird.)

Mar 09 '24 19:03 lrhn

By "If expressions" I meant your proposal #2306. (I don't understand the part where you said this expression cannot start the statement). Are you sure you want to allow "return" in this expression?

var x = if (cond) 42 else return 15;

Right now, you can't use "return" in expressions. E.g. var x= cond ? 42 : return 15; is an error.

Mar 09 '24 20:03 tatumizer

There are lots of ideas floating around here. The simplest is allowing return, break, continue and rethrow as expressions. That allows some of the uses people have for statements in expressions - the bailout if there is no good value.

With that, I would allow (whether if or ?/: syntax)

int foo(bool cond) {
  var x = cond ? 21 : return 15; // aka: `int x; if (cond) x = 21; else return 15;`
  return x + x;
}
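For comparison, a runnable version of the statement form from the comment above, as it can be written in Dart today. It relies on definite assignment analysis to accept the late initialization of x.

```dart
int foo(bool cond) {
  final int x;
  if (cond) {
    x = 21;
  } else {
    return 15; // early exit; `x` is never read on this path
  }
  return x + x;
}
```

With return as an expression, the declaration, the branch, and the early exit would collapse into the single line shown in the proposal.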

That would be the point of allowing those as expressions.

Mar 09 '24 21:03 lrhn
