nedit-ng icon indicating copy to clipboard operation
nedit-ng copied to clipboard

X Resource highlight patterns have "catastrophic backtracking"

Open eteran opened this issue 4 years ago • 1 comments

The "Resource Continued" pattern of:

^(\s*[^:\s]+\s*:)(?:(\\.)|.)*(\\)\n

Effectively hangs NEdit-ng (and nedit) for some inputs. Something as simple as this causes things to hang indefinitely:

Ada:Default\n\tAwk:Default\n\tC++:Default\n\tC:Default\n\tCSS:Default\n\tCsh:Default\n\tFortran:Default\n\tJava:Default\n\tJavaScript:Default\n\tLaTeX:Default\n\tLex:Default\n\tMakefile:Default\n\tMatlab:Default\n\tNEdit Macro:Default\n\tPascal:Default\n\tPerl:Default\n\tPostScript:Default\n\tPython:Default\n\tRegex:Default\n\tSGML HTML:Default\n\tSQL:Default\n\tSh Ksh Bash:Default\n\tTcl:Default\n\tVHDL:Default\n\tVerilog:Default\n\tXML:Default\n\tX Resources:Default\n\tYacc:Default

After debugging the issue, it does technically to "make progress", but that is only a technicality. The speed down is exponential so it quickly goes from seconds to parse, to minutes, to days, to years, etc...

This seems to be a function of how the regex engine works, perhaps the best solution is to say "don't do that" and replace the regex with one that doesn't have this issue if possible.

eteran avatar Jan 21 '20 21:01 eteran

Interesting. I have an old regex demo that I carry around ..

time perl -e 'print "Starting runaway regex:\n"; if ( "1,2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18,19,20,21,22,23,24,25,zzz" =~ /^(.*?,){27}z/ ) { print "match" } else { print "no match" }'

As shown (1 to 25) it takes about 4 seconds on my Mac, but more numbers (26, 27 ...) blow it up pretty quickly. I forget what it's doing ! but it's a classic regex backtrack explosion.

I mostly use Perl when I need some serious regex to work, it seems to have the best regex, and the doco is great -

https://perldoc.perl.org/perlre

Ciao.

grege2 avatar Dec 13 '20 20:12 grege2