roslyn
Code completion for mixed whitespace characters in raw string
Version Used:
Visual Studio Version 17.4.0 Preview 2.0 C# Tools 4.4.0-2.22430.14+2f760738cb92f32f50c981b68ba04ac3c8b7ee48
Steps to Reproduce:
_ = """
All whitespace characters are the same as the closing line.
U+20, U+2000, U+2001, U+2002, U+2003, U+2004, U+2005, U+2006, U+2007, U+2008, U+2009, U+200A, U+3000
Please insert a new line here:
""";
https://sharplab.io/#v2:CYLg1APg+gBAvDARMgsAKBoAAJCABIIAJDABICAEgoASBgBIOAEgEASCQBIFAEgAAwCCANszAO4AWAlgC4CmAZwAOAQwDG/GOM6iAThIFzBMeVN6cpg0QFspolRqnjmAe0HcAdgHMYzK/wB06LHiJkqdegFUwAJgAGABoYX0CA4ND/CIBGELCIv3jogIBmZPCAFgyIgFYcgIA2AoB2AoAOAoBOAsZk1IiAlxwCEgoaBgAFZn4DKStBfjleVRhLfnY7BxhNOX4QZrc2zwZkRABudCA===
Expected Behavior:
_ = """
All whitespace characters are the same as the closing line.
U+20, U+2000, U+2001, U+2002, U+2003, U+2004, U+2005, U+2006, U+2007, U+2008, U+2009, U+200A, U+3000
Please insert a new line here:
Whitespaces are copied from the closing line.
""";
https://sharplab.io/#v2:CYLg1APg+gBAvDARMgsAKBoAAJCABIIAJDABICAEgoASBgBIOAEgEASCQBIFAEgAAwCCANszAO4AWAlgC4CmAZwAOAQwDG/GOM6iAThIFzBMeVN6cpg0QFspolRqnjmAe0HcAdgHMYzK/wB06LHiJkqdegFUwAJgAGABoYX0CA4ND/CIBGELCIv3jogIBmZPCAFgyIgFYcgIA2AoB2AoAOAoBOAsZk1IiAlxwCEgoaBgAFZn4DKStBfjleVRhLfnY7BxhNOX4QZrc2zwYAdR4BEQkhVTnpU2FufmAYADM5Ux0YI2kzCxsp8ecMFvd2r2REAG4gA=
Actual Behavior:
_ = """
All whitespace characters are the same as the closing line.
U+20, U+2000, U+2001, U+2002, U+2003, U+2004, U+2005, U+2006, U+2007, U+2008, U+2009, U+200A, U+3000
Please insert a new line here:
Visual Studio IDE inserts ASCII spaces (U+0020), which causes CS9003 Error.
""";
https://sharplab.io/#v2:CYLg1APg+gBAvDARMgsAKBoAAJCABIIAJDABICAEgoASBgBIOAEgEASCQBIFAEgAAwCCANszAO4AWAlgC4CmAZwAOAQwDG/GOM6iAThIFzBMeVN6cpg0QFspolRqnjmAe0HcAdgHMYzK/wB06LHiJkqdegFUwAJgAGABoYX0CA4ND/CIBGELCIv3jogIBmZPCAFgyIgFYcgIA2AoB2AoAOAoBOAsZk1IiAlxwCEgoaBgAFZn4DKStBfjleVRhLfnY7BxhNOX4QFxgl5ZWlgDVuQQBXUTYAZV4t4G5TGABJABEAURgBod4VRj2AYTOzmBEJIRgACl9EgIAShCXG4Mmkoi2gxUzz2VQiqRgVzkclMcmcGBa7naXmQiAA3EA===
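For readers unfamiliar with the error, the rule behind it can be sketched roughly as follows: each content line of a multi-line raw string has to begin with exactly the whitespace characters that precede the closing `"""`. The helper below is hypothetical and is not the compiler's implementation; it assumes an ordinal, character-by-character prefix comparison and ignores the special handling of blank lines.

```csharp
using System;

static class RawStringWhitespaceCheck
{
    // Hypothetical helper (not a Roslyn API): true when the content line starts
    // with exactly the same whitespace characters as the closing line.
    // Ordinal comparison is used so that U+0020 and, say, U+2000 are never
    // treated as equivalent.
    public static bool StartsWithClosingIndentation(string contentLine, string closingIndentation)
        => contentLine.StartsWith(closingIndentation, StringComparison.Ordinal);
}
```

With a closing-line indentation of `"\u2000\u3000"`, a line indented with two ASCII spaces fails this check even though it occupies a similar width on screen, which is the situation the IDE creates here.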
As a side note, the "Convert to raw string" code fix handles mixed whitespace correctly.

Could you link to a file that contains the code in question? The above explanation is a bit confusing. Thanks!
Updated to link to SharpLab snippets.
FYI, you can reproduce the issue using only ASCII spaces (U+0020) and tabs (U+0009).
Could you just make an actual file? I really do not trust having to go into SharpLab to determine what the actual characters are at a given position. A file means there is no confusion at all about the individual bytes in the file. Thanks!
Can you download from https://gist.github.com/ufcpp/0366469fe355705fdfb062ce98570586 ?
Alternatively, copy and paste the following code, which should print 20 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 200A 3000:
foreach (var c in " ")
{
    Console.WriteLine($"{(int)c:X}");
}
Thanks. That's probably more than sufficient. I can look on Tuesday!
On a separate note, I've found another strange behavior in Visual Studio while experimenting with this issue. Visual Studio seems to display VT (U+000B) and FF (U+000C) as ♂ and ♀ respectively.
https://github.com/ufcpp/UfcppSample/blob/master/Demo/2022/Csharp11/17.2p1/RawStringLiteral/Whitespaces.cs

@ufcpp That sounds like the file was interpreted as CP437.
@ufcpp Please file that through the normal VS feedback channel. Thanks!
It doesn't seem to be CP437.

I just heard on Twitter that this glyph is used when displaying control characters in Wingdings.
Please file that through the normal VS feedback channel. Thanks!
I'll do it.
https://developercommunity.visualstudio.com/t/Visual-Studio-IDE-displays-ASCII-control/10156578 done.
@allisonchou The issue here is that this is the "indentation service" portion of VS. Specifically, it operates by querying us for the column the user's caret should be placed at. However, for raw strings, this really isn't the concept we want. For example, in this user's code the indentation is not made up of spaces/tabs, but of specialized whitespace characters.
In order to keep things functioning properly here, we should not do this processing through the indentation system, but have a specialized command handler (like RawStringLiteralCommandHandler) which intercepts this and places the exact right whitespace here before the caret.
That said, this is likely low priority. It will only affect people who happen to indent not using common indentation strategies (spaces/tabs). So it likely can be on the backlog unless we hear about this affecting more people.
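As a rough illustration of the difference described above, here is a minimal sketch. Both methods are hypothetical and are not part of Roslyn or of `RawStringLiteralCommandHandler`; the sketch also assumes that `char.IsWhiteSpace` is a close enough stand-in for the compiler's own notion of whitespace. The first method copies the closing line's leading whitespace character for character, while the second keeps only the column count, which is roughly the information a column-based indentation query has to work with.

```csharp
using System;

static class RawStringIndentation
{
    // Hypothetical helper: returns the leading whitespace of the line that holds
    // the closing """ delimiter, preserving every character exactly.
    public static string GetClosingLineIndentation(string closingLine)
    {
        int length = 0;
        while (length < closingLine.Length && char.IsWhiteSpace(closingLine[length]))
        {
            length++;
        }
        return closingLine.Substring(0, length);
    }

    // Column-based approach, for comparison: only the number of characters
    // survives, so every whitespace character is replaced by U+0020.
    public static string GetColumnBasedIndentation(string closingLine)
        => new string(' ', GetClosingLineIndentation(closingLine).Length);
}
```

For a closing line that begins with U+0020, U+2000, U+3000 and a tab, the first method returns those four characters unchanged, while the second returns four ASCII spaces, which no longer match the closing line.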