
[Proposal]: Label statements

Open stephentoub opened this issue 3 years ago • 16 comments

FEATURE_NAME

  • [x] Proposed
  • [ ] Prototype: Not Started
  • [ ] Implementation: Not Started
  • [ ] Specification: Not Started

Summary

Labels currently need to be part of statements but can't be statements on their own. We should consider making "Label:" a statement on its own.

Motivation

Source generators that use gotos and labels need to know, when generating code, whether a label being jumped to is at the end of a scope. If it is, emitting "Label:" will cause a compilation failure, because the C# spec and compiler currently prohibit a label from being directly followed by a closing brace. For example, if code is being generated for the equivalent of:

if (Condition())
{
    Work();
}

as:

if (!Condition()) goto AfterWork;
Work();
AfterWork:

the source generator now needs to either know that there will be additional code being generated after AfterWork:, track it to be able to output a semicolon if there isn't, or just always emit it as:

if (!Condition()) goto AfterWork;
Work();
AfterWork:;

which looks weird and makes for somewhat strange stepping behavior when debugging through the source-generated code. This is all because if it's emitted as:

{
    if (!Condition()) goto AfterWork;
    Work();
    AfterWork:
}

that will fail to compile.
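
A minimal sketch of the "track it" option described above, assuming a simple string-building emitter. The names `BlockWriter`, `EmitLabel`, `EmitStatement`, `OpenBlock`, and `CloseBlock` are hypothetical and not taken from any real generator. The writer holds a label back until it knows what follows, and appends the semicolon only when the block closes directly after the label:

```csharp
#nullable enable
using System;
using System.Text;

// Hypothetical helper, for illustration only. A label is held back until the
// next emit call, so the writer can tell whether the block ends right after it.
public sealed class BlockWriter
{
    private readonly StringBuilder _sb = new StringBuilder();
    private string? _pendingLabel;
    private int _depth;

    public void OpenBlock()
    {
        // "Label: { ... }" is legal today, so no semicolon is needed here.
        FlushLabel(atEndOfBlock: false);
        Line("{");
        _depth++;
    }

    public void EmitStatement(string statement)
    {
        // The pending label (if any) now has a statement to attach to.
        FlushLabel(atEndOfBlock: false);
        Line(statement);
    }

    public void EmitLabel(string name)
    {
        // A label directly followed by another label is also legal today.
        FlushLabel(atEndOfBlock: false);
        _pendingLabel = name;
    }

    public void CloseBlock()
    {
        // A label directly before '}' needs the empty statement to compile.
        FlushLabel(atEndOfBlock: true);
        _depth--;
        Line("}");
    }

    // Note: a label still pending (not yet followed by anything) is not included.
    public override string ToString() => _sb.ToString();

    private void FlushLabel(bool atEndOfBlock)
    {
        if (_pendingLabel is null) return;
        Line(_pendingLabel + (atEndOfBlock ? ":;" : ":"));
        _pendingLabel = null;
    }

    private void Line(string text) =>
        _sb.Append(' ', _depth * 4).Append(text).Append('\n');
}
```

With this sketch, calling `EmitLabel("AfterWork")` immediately before `CloseBlock()` writes `AfterWork:;`, while the same label followed by `EmitStatement(...)` is written as a bare `AfterWork:`. The extra state is exactly the bookkeeping that a standalone label statement would make unnecessary.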

Detailed design

The C# specification currently states:

A labeled_statement permits a statement to be prefixed by a label. Labeled statements are permitted in blocks, but are not permitted as embedded statements.

labeled_statement
    : identifier ':' statement
    ;

such that Label: is only permitted as part of a "labeled statement", i.e. there has to be a statement to put the label on. We should consider allowing Label: to be a statement of its own.

Drawbacks

Alternatives

Unresolved questions

Design meetings

stephentoub avatar Nov 24 '21 14:11 stephentoub

Personally, I don't see the advantages over always putting a semicolon. Yes, it looks a little strange, but I don't really see that as an issue needing to be resolved. Could you clarify more what the debugging behavior you're seeing is? Maybe that can be resolved on its own in Roslyn without having to make a language change.

333fred avatar Nov 24 '21 14:11 333fred

I don't see the advantages over always putting a semicolon. Yes, it looks a little strange

Avoiding looking strange is an advantage. It may not be hugely compelling, but it's an advantage nonetheless ;-)

Could you clarify more what the debugging behavior you're seeing is?

The semicolon is its own statement. So you need to step through it.

The additional advantage is someone needs to learn the workaround, after being bitten by it in the first place. I had to go and find all the places this was possibly happening and update them to include semicolons, which is not a natural instinct.

stephentoub avatar Nov 24 '21 14:11 stephentoub

I don't see the advantages over always putting a semicolon. Yes, it looks a little strange

Avoiding looking strange is an advantage. It may not be hugely compelling, but it's an advantage nonetheless ;-)

This is one of the few cases where, since goto is used rarely and labels without a statement are even rarer, the cost far outweighs the benefit, I think.

Could you clarify more what the debugging behavior you're seeing is?

The semicolon is its own statement. So you need to step through it.

That's probably a debugger improvement, probably an item for Visual Studio?

The additional advantage is someone needs to learn the workaround, after being bitten by it in the first place. I had to go and find all the places this was possibly happening and update them to include semicolons, which is not a natural instinct.

How can it be happening elsewhere, if the code doesn't compile without the semicolon? Or do you mean that you had actually added some "no-op" line as your previous workaround?

TahirAhmadov avatar Nov 24 '21 14:11 TahirAhmadov

How can it be happening elsewhere, if the code doesn't compile without the semicolon?

Source generators can have many different paths they take to output the source, in many combinations. It can take the right (wrong) sequence of events to end up with a label at the end of scope, in which case it fails to compile, but you don't discover it until you get that exact right sequence that leads to the label being not followed by another statement.

This is one of the few cases where since goto is used rarely

I expect it'll be used more commonly in source generators. In particular for source generators used to implement DSLs.

labels without a statement are even rarer

Again, the reason I raised source generators as the primary case for this is because they end up generating code that a human may not think to write.

stephentoub avatar Nov 24 '21 14:11 stephentoub

Again, the reason I raised source generators as the primary case for this is because they end up generating code that a human may not think to write.

Isn't this also code that a human is less likely to have to read? If the author of a source generator is concerned about how their code looks and can be parsed by a human I would think that the issue of following a label with an empty statement would be the least of their concerns.

HaloFour avatar Nov 24 '21 14:11 HaloFour

Isn't this also code that a human is less likely to have to read?

"have to read", sure. But it's code being emitted into the project. It's part of the app. You debug through it just as you debug through code you wrote by hand. Source generator authors are (or at least, in my opinion, should be) concerned about making that code as nice to read and understand and debug as possible.

the issue of following a label with an empty statement would be the least of their concerns

It is a concern. I've never said it's the most important thing in the world.

I'm raising a real issue I've hit and that has a strange-looking workaround I had to discover and fix multiple times only after tripping over it multiple times.

stephentoub avatar Nov 24 '21 14:11 stephentoub

@stephentoub

It is a concern. I've never said it's the most important thing in the world.

I'm raising a real issue I've hit and that has a strange-looking workaround I had to discover and fix multiple times only after tripping over it multiple times.

Sure, I guess my point is that if you were optimizing for human readability of that generated code you very likely wouldn't be reaching for a label to begin with, especially not at the end of a block. This situation arises specifically because the code is generated and is already very unlike what a human would write manually.

I'm not arguing against this proposal specifically, but I do think that it is kinda opening a can of worms to consider that the language syntax should be changed to reduce friction specifically for source generators.

HaloFour avatar Nov 24 '21 15:11 HaloFour

you very likely wouldn't be reaching for a label to begin with, especially not at the end of a block. This situation arises specifically because the code is generated and is already very unlike what a human would write manually.

The situation arises because it needs a branching structure that can't be represented well with if/elses, loops, etc. Sometimes gotos are actually the cleanest / most readable answer.

stephentoub avatar Nov 24 '21 15:11 stephentoub

@stephentoub

The situation arises because it needs a branching structure that can't be represented well with if/elses, loops, etc. Sometimes gotos are actually the cleanest / most readable answer.

I'm not arguing against the use of goto either, although I would bet money that if you handed this same problem to a room full of average developers the number who would use goto would be in the extreme minority. 😁

HaloFour avatar Nov 24 '21 15:11 HaloFour

One thing I think is worth pointing out is that this feature does not complicate the language. It removes a wrinkle (I can't put a label at the end? I need an empty statement? What's an empty statement?) and as far as I can see smoothly generalizes the syntax. So I think the language would be the better for it.

The main question then is, is it worth our effort? As has been pointed out above, use of goto in manually written code is relatively rare. The main scenario where this is useful is in source generators, where the lack of regularity in the language today leads either to a more complicated generator or less natural source output.

There's a difference between a feature that is for source generation versus one that is motivated by it. If source generator scenarios are what can cause a small language clean-up to rise above the value threshold to invest in it then I have no problem with that.

One thing that should be considered part of the cost of this feature is that it will change the shape of syntax trees for all existing uses of labels. Instead of being one labeled_statement containing another statement they will now be a label_statement followed by another statement. Chasing down the syntax tree consumers and fixing them up may turn out to be a bigger cost than implementing the change in Roslyn itself.

Either way it seems to me to be a balancing of one-time cost vs long-term value. There does not seem to me to be ongoing cost (to the language, the compiler codebase or downstream tools) associated with this feature.

MadsTorgersen avatar Nov 24 '21 18:11 MadsTorgersen

especially not at the end of a block

@HaloFour I don't think a label appearing at the end of a block is more special or unlikely than appearing at all.

if (...)
{
    for (...)
    {
        for (...)
        {
            if (...) goto afterBothLoops;
            // ...
        }
    }
    // There just happens to be nothing to do between the outer loop and the end of the scope containing it
    afterBothLoops:
}

If if/for/for/if seems unlikely, picture using/for/switch.

jnm2 avatar Nov 24 '21 18:11 jnm2

@MadsTorgersen

One thing that should be considered part of the cost of this feature is that it will change the shape of syntax trees for all existing uses of labels.

It doesn't have to. We could represent this as a labeled statement with a true 'empty' child statement. This could be a new node type, or reusing the ';' statement (with a missing token), or just allowing the child to be optional here.

That would keep the shape for all existing labels, and only add a new shape to understand for this new case.
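
As a rough illustration of why that keeps existing consumers working, here is a toy syntax model. These are hypothetical record types and not Roslyn's actual node classes; `SemicolonMissing` merely stands in for the "missing token" idea. A consumer written against the existing "label wraps a child statement" shape handles a bare label without changes:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Toy syntax model for illustration only; not Roslyn's real node types.
public abstract record Statement;
public sealed record ExpressionStatement(string Text) : Statement;
// SemicolonMissing models an empty statement whose ';' token is absent.
public sealed record EmptyStatement(bool SemicolonMissing) : Statement;
public sealed record LabeledStatement(string Identifier, Statement Statement) : Statement;

public static class LabelCollector
{
    // Written against the existing shape: every label has a child statement,
    // and labels can nest ("A: B: stmt").
    public static List<string> Collect(IEnumerable<Statement> block)
    {
        var labels = new List<string>();
        foreach (var statement in block)
        {
            var current = statement;
            while (current is LabeledStatement labeled)
            {
                labels.Add(labeled.Identifier);
                current = labeled.Statement;
            }
        }
        return labels;
    }
}
```

In this model a bare `AfterWork:` at the end of a block would be `new LabeledStatement("AfterWork", new EmptyStatement(SemicolonMissing: true))`, so `Collect` still finds it. Only code that inspects the child statement itself would see the new case.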

CyrusNajmabadi avatar Nov 24 '21 18:11 CyrusNajmabadi

@CyrusNajmabadi

It doesn't have to. We could represent this as a labeled statement with a true 'empty' child statement. This could be a new node type, or reusing the ';' statement (with a missing token), or just allowing the child to be optional here.

Clever! It would not be the most natural representation but that might well be worth it for the back compat.

MadsTorgersen avatar Nov 24 '21 18:11 MadsTorgersen

Clever! It would not be the most natural representation but that might well be worth it for the back compat.

Right. Note, we often take the most 'compatible' representation, often with an eye toward 'ease of consumption' for the compiler model. For example, 'file scoped namespaces' are represented in a way such that the items that follow it in the file are actually represented as children of it (in the same fashion as normal namespaces). This allows high potential for continued reuse of the current shape, with little need to special case.

CyrusNajmabadi avatar Nov 24 '21 18:11 CyrusNajmabadi

The semicolon is its own statement. So you need to step through it.

And the solution is to add yet another statement (that you'd have to step through)?

gafter avatar Apr 06 '22 18:04 gafter

And the solution is to add yet another statement (that you'd have to step through)?

You already step through it.

If I have:

A();
Label:;
B();

and I'm on A(), I press F10 to take me to Label, then F10 to take me to the semicolon, then F10 to take me to B().

And if I have:

A();
Label:
B();

and I'm on A(), I press F10 to take me to Label, then F10 to take me to B().

From a stepping perspective, Label is already a statement.

stephentoub avatar Apr 06 '22 19:04 stephentoub