csharp-tmLanguage

Freeze when dealing with long (unbreakable) strings

Open aeschli opened this issue 8 years ago • 8 comments

From @kwazel on March 15, 2017 7:29

  • VSCode Version: 1.10.2
  • OS Version: Windows 6.1 (Build 7601), Windows 10 Version 1607 (OS Build 14393.0)

Steps to Reproduce:

  1. Open a C# file with a long string (e.g. the attached file with extension .cs)
  2. Syntax coloring may start but not progress past the long string
  3. Visual Studio Code will become unresponsive

crasher2.cs.txt
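
The attachment itself is not reproduced in this thread. As a rough stand-in, a file with a similar shape can be generated with a short script. This is only a sketch: the file name `long_string_repro.cs`, the helper names, and the default of 500 repetitions (about 3000 characters, matching the size mentioned later in the thread) are assumptions, not details of the original attachment. The trailing semicolon is left off to mirror the sample line quoted in the thread.

```python
from pathlib import Path


def make_long_string_line(repeats: int = 500) -> str:
    """Return one C# statement whose string literal has no whitespace to break on."""
    payload = "One" + "AndOne" * repeats
    return f'Console.WriteLine("i={payload}")'


def write_repro(path: Path, repeats: int = 500) -> int:
    """Write the statement as a single line and return its length in characters."""
    line = make_long_string_line(repeats)
    path.write_text(line + "\n", encoding="utf-8")
    return len(line)


if __name__ == "__main__":
    # Hypothetical output file name; open the result in the editor to test.
    length = write_repro(Path("long_string_repro.cs"))
    print(f"wrote {length} characters on a single line")
```

Raising `repeats` makes the line longer, which may help when checking whether a grammar change scales with line length.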

Copied from original issue: Microsoft/vscode#22651

aeschli, Mar 15 '17 11:03

The freeze happens when the following regex is evaluated:

(?x)
(?!.*\b(?:class|interface|struct|enum|event)\b)\s*
(?<return-type>
  (?<type-name>
    (?:
        (?:(?<identifier>[_[:alpha:]][_[:alnum:]]*)\s*\:\:\s*)? # alias-qualification
        (?<name-and-type-args> # identifier + type arguments (if any)
          \g<identifier>\s*
          (?<type-args>\s*<(?:[^<>]|\g<type-args>)+>\s*)?
        )
        (?:\s*\.\s*\g<name-and-type-args>)* # Are there any more names being dotted into?
        (?:\s*\*\s*)* # pointer suffix?
        (?:\s*\?\s*)? # nullable suffix?
        (?:\s*\[(?:\s*,\s*)*\]\s*)* # array suffix?
    )|
    (?<tuple>\s*\((?:[^\(\)]|\g<tuple>)+\))
  )\s+
)
(?<interface-name>\g<type-name>\s*\.\s*)?
(?<property-name>\g<identifier>)\s*
(?=\{|=>|$) 

in

Console.WriteLine("i=OneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAn
dOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOneAndOne")

aeschli, Mar 15 '17 11:03

We added debug statements to the oniguruma code to print each regex before it is evaluated. The regex given above took a very long time to evaluate. I don't have any further insights.

If you want to do the same (I did it on Linux):

  • in a vscode workspace look for /node_modules/oniguruma/src/onig-reg-exp.cc
  • at line 52 add printf("Searching for %s in %s\n", source_.c_str(), data+position);
  • open a terminal in the oniguruma folder and run node-gyp build to rebuild the C++ code.
  • run /scripts/code.sh from the terminal, open the problematic file and observe the output in the terminal you started code.sh from.

aeschli, Mar 23 '17 14:03

FWIW, trying it out in Rubular, it appears that the problem is the (?!.*\b(?:class|interface|struct|enum|event)\b)\s* clause. Removing that causes it to work just fine. I'll have to sort that out. Thanks for the help!
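
One way to probe that clause in isolation is to count the start offsets at which it succeeds. The sketch below uses Python's `re` module rather than Oniguruma, so it is an approximation and does not reproduce the freeze; the function name and the sample lines are made up for illustration. On a line with none of the listed keywords the negative lookahead passes at every offset, and each check has to scan the rest of the line, so the clause never prunes a candidate start position before the rest of the pattern is tried.

```python
import re

# The clause singled out above, compiled on its own.
# Assumption: Python's re engine stands in for Oniguruma here.
LOOKAHEAD = re.compile(r"(?!.*\b(?:class|interface|struct|enum|event)\b)\s*")


def positions_passing_lookahead(line: str) -> int:
    """Count start offsets (0..len(line)) at which the clause matches."""
    return sum(
        1 for pos in range(len(line) + 1) if LOOKAHEAD.match(line, pos)
    )


if __name__ == "__main__":
    # A shortened version of the sample line; no keyword appears in it.
    long_line = 'Console.WriteLine("i=One' + "AndOne" * 100 + '")'
    passing = positions_passing_lookahead(long_line)
    print(f"{passing} of {len(long_line) + 1} offsets pass on the long line")

    # With a keyword present, offsets at or before it are rejected.
    print(positions_passing_lookahead("public class Foo"), "offsets pass")
```

Whether this rescanning alone accounts for the slowdown, or only in combination with the recursive type-name groups that follow it, is not something this sketch can settle.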

DustinCampbell, Mar 23 '17 14:03

@aeschli: out of curiosity where did this file come from? It's not syntactically correct (missing semicolons).

DustinCampbell, Mar 25 '17 13:03

@kwazel is the original issuer, maybe he knows.

aeschli, Mar 27 '17 09:03

I wrote that piece of code by hand (I cannot give you the code I was having this problem with). It is true that I left out a semicolon. Sorry for that, but it seems hardly relevant, because I do not want my editor to freeze on invalid (or incomplete) code either.

kwazel, Mar 27 '17 10:03

Thanks! I was just curious whether this had come from some sort of official performance test. If that were true, I'd want to correct the test case.

Note that missing semicolons aren't the only issue. The statements in this sample are at the class level (i.e. not included in a method body), which is why the regular expression for property declarations is running. I agree that it shouldn't freeze and I understand the importance of broken and incomplete code, but I also think this particular scenario is somewhat rare -- especially given that the sample contains a 3000 character single-line string literal.

I'm experimenting with ways to address this. It's pretty tricky though. :smile:

DustinCampbell, Mar 27 '17 13:03

The real scenario I was facing was due to generated (actually converted) code. By now I have fixed that code to contain newlines, but it did not use to have these. You are right that it was in a method, not at class level. I reproduced the file and attached it. vsc_freezer.cs.txt

kwazel, Mar 27 '17 14:03