nedit-ng
nedit-ng copied to clipboard
X Resource highlight patterns have "catastrophic backtracking"
The "Resource Continued" pattern of:
^(\s*[^:\s]+\s*:)(?:(\\.)|.)*(\\)\n
Effectively hangs NEdit-ng
(and nedit
) for some inputs. Something as simple as this causes things to hang indefinitely:
Ada:Default\n\tAwk:Default\n\tC++:Default\n\tC:Default\n\tCSS:Default\n\tCsh:Default\n\tFortran:Default\n\tJava:Default\n\tJavaScript:Default\n\tLaTeX:Default\n\tLex:Default\n\tMakefile:Default\n\tMatlab:Default\n\tNEdit Macro:Default\n\tPascal:Default\n\tPerl:Default\n\tPostScript:Default\n\tPython:Default\n\tRegex:Default\n\tSGML HTML:Default\n\tSQL:Default\n\tSh Ksh Bash:Default\n\tTcl:Default\n\tVHDL:Default\n\tVerilog:Default\n\tXML:Default\n\tX Resources:Default\n\tYacc:Default
After debugging the issue, it does technically to "make progress", but that is only a technicality. The speed down is exponential so it quickly goes from seconds to parse, to minutes, to days, to years, etc...
This seems to be a function of how the regex engine works, perhaps the best solution is to say "don't do that" and replace the regex with one that doesn't have this issue if possible.
Interesting. I have an old regex demo that I carry around ..
time perl -e 'print "Starting runaway regex:\n"; if ( "1,2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18,19,20,21,22,23,24,25,zzz" =~ /^(.*?,){27}z/ ) { print "match" } else { print "no match" }'
As shown (1 to 25) it takes about 4 seconds on my Mac, but more numbers (26, 27 ...) blow it up pretty quickly. I forget what it's doing ! but it's a classic regex backtrack explosion.
I mostly use Perl when I need some serious regex to work, it seems to have the best regex, and the doco is great -
https://perldoc.perl.org/perlre
Ciao.