PowerShell-RFC

Add Update here-string syntax RFC

Open MartinGC94 opened this issue 2 years ago • 4 comments

MartinGC94 avatar Jan 23 '23 23:01 MartinGC94

The Language-WG met to discuss this on 8/10/23 and had the following observations and conclusions:

  • we agreed that there is substantial risk in altering the behavior of our current HereString and aren't willing to change the current HereString syntax
  • we agreed that there is substantial value in supporting a new syntax which allows indentation of the HereString to be handled more gracefully.
  • we felt that supporting a single-line HereString does not offer overwhelming value, and we are concerned that any implementation may increase the support burden.
  • we agreed that an appropriate syntax to designate the new behavior should be @""
    • we believe it may have reduced implementation cost as the tokenizer is already recognizing @" and this would be a special case of what we already recognize
    • we don't preclude adding support for a single-line HereString in the future
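For context, the indentation problem with the existing syntax can be sketched as follows. This is only an illustration of current behavior; the function name `Get-Greeting` is made up for the example. With today's here-strings the closing `'@` has to sit at the very start of its line, and any indentation in the body becomes part of the value.

```powershell
function Get-Greeting {
    # Existing here-string syntax: the footer '@ must start at column 0,
    # and the indentation of the body lines is kept in the string value.
    $text = @'
        Hello
        World
'@
    return $text
}

$value = Get-Greeting

# Show each line in brackets so the leading whitespace is visible.
# Every line still carries its eight leading spaces.
$value -split "`r?`n" | ForEach-Object { "[$_]" }
```

The proposed syntax is meant to let the body and footer be indented with the surrounding code without that whitespace ending up in the string.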

Specifics:

  • the behavior should match our current HereString with regard to single and double quotes, i.e. @'' is constant (no expansion) and @"" is expandable.
  • the starting token (@"" or @'') must be the last non-whitespace token of a line
  • the ending token (""@ or ''@) must be the first non-whitespace token of a line
  • the column offset of the ending token (""@) will be used to determine how much white space at the beginning of the line to trim.
  • if that whitespace does not exist in a line, a parse error should be generated (similar to the error that C# generates)
  • the starting and ending token lines are not part of the HereString
  • while the arguments for @@' were very well put together and persuasive we ultimately still prefer @'' for understandability
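The trimming rule in the list above can be modelled roughly as below. This is a sketch under assumptions, not the tokenizer implementation: `Remove-HereStringIndent` is a hypothetical helper, the footer's column offset is represented by the literal whitespace in front of the closing token, and the prefix comparison is exact (spaces and tabs are not treated as interchangeable). Blank lines are not special-cased here because the list does not address them.

```powershell
function Remove-HereStringIndent {
    # Hypothetical helper, not part of PowerShell: models the proposed rule
    # that the footer's indentation is trimmed from every body line.
    param(
        [string[]] $BodyLines,   # lines between the header and footer lines
        [string]   $FooterIndent # whitespace in front of the closing token
    )

    $result = foreach ($line in $BodyLines) {
        if (-not $line.StartsWith($FooterIndent, [System.StringComparison]::Ordinal)) {
            # Stands in for the parse error the WG asks for.
            throw "Line '$line' is not indented as far as the closing token."
        }
        $line.Substring($FooterIndent.Length)
    }
    return ($result -join "`n")
}

$indent = ' ' * 8

# Well-formed body: the eight-space prefix is removed from each line.
Remove-HereStringIndent -BodyLines @('        line 1', '        line 2') -FooterIndent $indent

# Under-indented line: reported as an error instead of being trimmed.
try {
    Remove-HereStringIndent -BodyLines @('        line 1', '  line 2') -FooterIndent $indent
}
catch {
    "Error: $($_.Exception.Message)"
}
```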

we look forward to your updates

JamesWTruher avatar Aug 10 '23 19:08 JamesWTruher

@JamesWTruher (and WG) thanks for the review. I have a few questions. First, regarding:

> the starting token (@"" or @'') must be the last token of a line

This is different from the current @' syntax which allows whitespace characters after the header (the whitespace chars are not included in the string value though). Are you sure you want this slight difference between the old and new syntax?

> if that whitespace does not exist in a line, a parse error should be generated (similar to the error that c# generates)

What about empty lines like line2 here:

    @''
    Line1

    Line3
    ''@

Should they also have that whitespace, or can they be left completely empty? Editors like VS code will not indent empty lines and some tools will auto remove trailing whitespace so I think the UX will suffer if we make the whitespace mandatory for empty lines.

MartinGC94 avatar Aug 10 '23 22:08 MartinGC94

> @JamesWTruher (and WG) thanks for the review. I have a few questions. First, regarding:
>
> > the starting token (@"" or @'') must be the last token of a line
>
> This is different from the current @' syntax which allows whitespace characters after the header (the whitespace chars are not included in the string value though). Are you sure you want this slight difference between the old and new syntax?

Sorry, I've updated that: it should be the last non-whitespace token; any whitespace that follows is ignored.

As an aside, the current error message is curious, yes?

> @"           a     
ParserError: 
Line |
   1 |  @"           a
     |               ~
     | No characters are allowed after a here-string header but before the end of the line.

Strictly speaking, " " is a character.

In any event, we're not interested in introducing any differences with the starting token.
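The existing header behavior can be checked with the public parser API. The snippet below is a small sketch; `Get-ParseErrorCount` is a hypothetical helper written for this example, while `Parser.ParseInput` is the real `System.Management.Automation.Language` entry point.

```powershell
function Get-ParseErrorCount {
    # Hypothetical helper: parse a snippet and return how many parse errors it has.
    param([string] $Source)

    $tokens = $null
    $errors = $null
    $null = [System.Management.Automation.Language.Parser]::ParseInput(
        $Source, [ref] $tokens, [ref] $errors)
    return $errors.Count
}

# Current here-string header followed only by spaces.
$trailingSpaces = "@'   `nabc`n'@"

# Current here-string header followed by a non-whitespace character.
$trailingText = "@' a`nabc`n'@"

# The first snippet should parse cleanly; the second should report the
# "No characters are allowed after a here-string header" error quoted above.
Get-ParseErrorCount -Source $trailingSpaces
Get-ParseErrorCount -Source $trailingText
```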

> > if that whitespace does not exist in a line, a parse error should be generated (similar to the error that c# generates)
>
> What about empty lines like line2 here:
>
>     @''
>     Line1
>
>     Line3
>     ''@
>
> Should they also have that whitespace, or can they be left completely empty? Editors like VS code will not indent empty lines and some tools will auto remove trailing whitespace so I think the UX will suffer if we make the whitespace mandatory for empty lines.

wrt empty lines: we didn't discuss it, but it's a good question. I expect the scanner should just ignore empty lines. We were more worried about the behavior of:

$a = @"
        line 1
  line 2
        line 3
        "@

C# emits a syntax error in this case, and we didn't think we should try to be different here. I'm not sure what C# does in the empty-line case; we should probably follow its lead unless there's a good reason for us to be different.

JamesWTruher avatar Aug 11 '23 19:08 JamesWTruher

@JamesWTruher I've updated the RFC to remove the single-line here-string references and to use the multiple-quotes syntax rather than the @ symbols. I also made the specification more precise. In addition, I tested the C# string literal behavior for empty lines and for lines that consist only of whitespace: any such line with less than or equal to the amount of whitespace that the "footer" line has is treated as a plain line break, so I will do the same for the PS version.
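A rough model of that blank-line rule, for illustration only: `Remove-HereStringIndentLenient` is a hypothetical helper, not the RFC's implementation. It assumes that a whitespace-only line longer than the footer indentation is trimmed like any other line, which the comment above does not spell out.

```powershell
function Remove-HereStringIndentLenient {
    # Hypothetical helper, not a PowerShell API: trims the footer indentation
    # from each body line, treating short blank lines as plain line breaks.
    param(
        [string[]] $BodyLines,   # lines between the header and footer lines
        [string]   $FooterIndent # whitespace in front of the closing token
    )

    $result = foreach ($line in $BodyLines) {
        if ($line.Trim().Length -eq 0 -and $line.Length -le $FooterIndent.Length) {
            # Empty, or whitespace no longer than the footer indentation:
            # contributes only a line break.
            ''
        }
        elseif ($line.StartsWith($FooterIndent, [System.StringComparison]::Ordinal)) {
            $line.Substring($FooterIndent.Length)
        }
        else {
            # Non-blank line that is under-indented: still an error.
            throw "Line '$line' is not indented as far as the closing token."
        }
    }
    return ($result -join "`n")
}

# The example from earlier in the thread: Line1, a completely empty line, Line3.
$body = @('    Line1', '', '    Line3')
Remove-HereStringIndentLenient -BodyLines $body -FooterIndent (' ' * 4)
```

With this rule, editors that strip trailing whitespace or leave empty lines unindented would not turn a valid here-string into a parse error.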

MartinGC94 avatar Aug 11 '23 23:08 MartinGC94