vscode-textmate icon indicating copy to clipboard operation
vscode-textmate copied to clipboard

100% Freeze on any catastrophic backtracking

Open jeff-hykin opened this issue 6 years ago • 4 comments

Reasons a change is needed:

  • Debugging catastrophic backtracking is tough because the entire process freezes. Inside of VS Code, basically the only way to find the bug is to guess what it is, and the see if the process still crashes.
  • Catastrophic backtracking is easy to accidentally introduce, even for regex veterans
  • Catastrophic backtracking is hard to write tests for; it can require a very specific circumstance to become catastrophic

Recommended change: Find some way to have a timeout on a per-line check. If the line goes past the timeout (ex: 5 seconds) the parser just groups the entire rest of the document (all of the remaining lines) as one scope, and closes any unclosed scopes. The reason being: that it would fail safely (no freeze) but would fail badly (not-user friendly) so that the failsafe is not used a a crutch to allow poorly written regex to become acceptable. Failing like this would also give the dev the ability to see exactly which line the error occurred at.

Difficulty: Probably pretty difficult. The only solutions I can imagine are that the code needs to be async (I'd assume its currently synchronous), or some advanced tool like worker threads would need to be used (pass the current line to the worker, let it start working, if the worker doesn't finish in the timeout, then fail)

jeff-hykin avatar Aug 13 '19 18:08 jeff-hykin

We call into oniguruma, which is written in C++. Control never returns and it does not have a timeout option of any sorts. In the past, I have "debugged" such cases only by logging in node-oniguruma right before the call into oniguruma.

Here I would add a printf for source_.c_str() so I'd get to find out which regular expression begins and never terminates.

alexdima avatar Aug 13 '19 18:08 alexdima

@alexandrudima Is the call into oniguruma from this upstream https://github.com/kkos/oniguruma? I can post an issue on there.

Funny enough, I'd bet this issue is much easier to fix in C++ thanks to multithreading.

Thanks for the debugging advice too.

jeff-hykin avatar Aug 13 '19 18:08 jeff-hykin

We have plans to improve things for VS Code. See https://github.com/microsoft/vscode/issues/77140 where we want to look into running the TM grammars on separate threads.

We consume oniguruma via the node-oniguruma project at https://github.com/atom/node-oniguruma . I'm not entirely sure why, but they have a pretty old version of oniguruma

alexdima avatar Aug 14 '19 10:08 alexdima

That is an awesome post, thanks for linking me to it.

jeff-hykin avatar Aug 14 '19 15:08 jeff-hykin