C# Codegenerator: Escape character in substitution unnecessary, even wrong
Bug Description
When you use `@""` to mark a string as a verbatim string literal, escape characters are not required. To display a backslash you only need a single `\` in the text. Escape sequences like `\n` or `\t` are not interpreted either; the string must contain the actual non-visible characters instead.
Reproduction steps
Set up regex101 like this:
| Setting | Value |
|---|---|
| Flavor | C# |
| Function | Substitution |
| Regular Expression | Hello\sWorld |
| Test String | This is some text with a Hello World in it |
| Substitution | Hello World\Earth |
In the result view you see the correct result:
This is some text with a Hello World\Earth in it
Switch to Code Generator and you should see code like this:
    using System;
    using System.Text.RegularExpressions;

    public class Example
    {
        public static void Main()
        {
            string pattern = @"Hello\sWorld";
            string substitution = @"Hello World\\Earth";
            string input = @"This is some text with a Hello World in it";
            RegexOptions options = RegexOptions.Multiline;

            Regex regex = new Regex(pattern, options);
            string result = regex.Replace(input, substitution);
        }
    }
Extend the code with the line `Console.WriteLine(result);` and run it on try.dot.net.
As you can see, the result is `This is some text with a Hello World\\Earth in it` instead of the expected outcome.
Expected Outcome
This is some text with a Hello World\Earth in it
Either emit the correct characters in combination with `@""`, or don't use the `@` prefix. Alternatively, make the substitution input field a textarea so that multiline replacements can be written the same way as in the Test String input, because that input is formatted correctly in the generated code.
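To illustrate the difference, here is a minimal sketch that runs the same replacement with three spellings of the substitution string. The class and variable names are made up for this example; the pattern and input are the ones from the reproduction steps. It relies on the fact that in .NET replacement patterns only `$` is special, so backslashes are copied to the output as they are.

```csharp
using System;
using System.Text.RegularExpressions;

public class SubstitutionEscapingDemo
{
    public static void Main()
    {
        string pattern = @"Hello\sWorld";
        string input = @"This is some text with a Hello World in it";

        // What the Code Generator currently emits: a verbatim string with a
        // doubled backslash. Both backslashes are kept literally.
        string verbatimDoubled = @"Hello World\\Earth";

        // Verbatim string with a single backslash: one backslash in the value.
        string verbatimSingle = @"Hello World\Earth";

        // Regular (non-verbatim) string: here \\ is an escape sequence for
        // one backslash.
        string regularEscaped = "Hello World\\Earth";

        // prints: This is some text with a Hello World\\Earth in it
        Console.WriteLine(Regex.Replace(input, pattern, verbatimDoubled, RegexOptions.Multiline));

        // prints: This is some text with a Hello World\Earth in it
        Console.WriteLine(Regex.Replace(input, pattern, verbatimSingle, RegexOptions.Multiline));

        // prints: This is some text with a Hello World\Earth in it
        Console.WriteLine(Regex.Replace(input, pattern, regularEscaped, RegexOptions.Multiline));
    }
}
```

Either of the last two spellings would give the expected outcome; only the combination of `@` with a doubled backslash produces the extra `\`.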
Browser
Tested with
| Browser | Version |
|---|---|
| Firefox | 120.0.1 (64-bit) |
| MS Edge | 119.0.2151.97 (64-bit) |
| Google Chrome | 119.0.6045.200 (64-bit) |
OS
Windows 11 22H2
#2186
@NCC1701M: should the substitution text on regex101 be:
- `Hello World\Earth` with a single `\` backslash, or
- `Hello World\\Earth` with two `\\` backslashes?
When I use a single backslash, I see this in the SUBSTITUTION field:
When I use two backslashes, I see this:
This does change the Code Generator string substitution = ... text, of course, but I think that people might worry about an error in their pattern if they see an error like this on the regex101 page.
However, it does appear that the generated code uses @"..." (or whatever delimiter had been selected) for that string substitution = @"..."; field as a string literal instead of a raw string.
I had thought we added the `@"..."` string literal to the Code Generator output because it fixed a previous bug where certain conditions were not being escaped correctly...
@OnlineCop
The substitution should produce what I described in the Expected Outcome section: `This is some text with a Hello World\Earth in it`, with a single `\`.
If a double `\\` is required in the substitution text because a single one is used as an escape character for newlines, tabs, etc., that's fine, but in the generated code it has to be a single `\`.
As I suggested above, it might be easier if the substitution input field supported multi-line text, so that the user can enter text like
This is my
multi line substitution
with a simple \ in it.
instead of `This is my\nmulti line substitution\nwith a simple \\ in it.`
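As a rough sketch of what the requested output could look like, the helper below turns already-unescaped substitution text into a C# verbatim literal. `ToVerbatimLiteral` is a hypothetical name for this illustration and is not taken from regex101's code generator. It assumes the generator has already resolved `\n`, `\\` and similar sequences from the input field into real characters. Inside a verbatim literal the only character that needs escaping is the double quote, which is written as `""`.

```csharp
using System;

public static class VerbatimLiteralSketch
{
    // Hypothetical helper (an assumption for this example, not regex101 code):
    // wraps text in @"..." and doubles any embedded double quotes.
    // Backslashes and line breaks are emitted unchanged.
    public static string ToVerbatimLiteral(string text)
    {
        return "@\"" + text.Replace("\"", "\"\"") + "\"";
    }

    public static void Main()
    {
        // The substitution after the input field's escapes have been resolved:
        // two real newlines and one real backslash.
        string substitution = "This is my\nmulti line substitution\nwith a simple \\ in it.";

        // prints:
        // @"This is my
        // multi line substitution
        // with a simple \ in it."
        Console.WriteLine(ToVerbatimLiteral(substitution));
    }
}
```

The printed literal spans three source lines and contains a single backslash, which matches the multi-line text shown above.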