eslint-plugin-regexp
eslint-plugin-regexp copied to clipboard
Detect super-linear worst-case runtime
I think we should add a rule that reports cases for exponential backtracking and polynomial backtracking.
However, there are two problems standing in our way:
Performance
Detecting exponential backtracking is computationally expensive.
Static analysis methods might only take a few milliseconds per regex but this will quickly add up for a large codebase (e.g. Prism has about 2k regexes). Fuzzers will usually take about one second or more per regex and are completely impractical for our purposes.
Solutions
-
Since the detection itself is deterministic, a simple cache might largely solve the performance problem.
However, I am not aware of ESLint providing such a feature.
Detection
I am not aware of any existing detector implementation in JavaScript (time of writing: 2021-04-19).
safe-regex
isn't practical. It implements a detection method that simply doesn't work. Star height has little to do with exponential runtime. I.e. (a|a)*b
has a star height of 1 and the (a|a)*
will cause exponential backtracking, and (ab+)*c
has a star height of 2 without any form of exponential backtracking.
Solutions
-
Right now, we could use my library scslre to detect at least some polynomial and exponential backtracking quickly. scslre even offers fixes for simple cases.
-
There is also this method I implemented for Prism. This method is currently used in PrismJS and Highlight.JS. The upside of this method is that it is quick. It can analyze the >2.5k regexes of Prism in less than 5 seconds. The downside of this method is that there are false negatives and it cannot offer fixes. However, the positives it does find are mostly true positives.
I should note that half of this method is already partially implemented by the
no-dupe-disjunctions
rule. If we find two alternatives that are not disjoint and both are children of a star quantifier, then we can reasonably assume that this will cause exponential backtracking (e.g.(a|a)*
). Changing theno-dupe-disjunctions
to add this information should be fairly easy. -
Given that refa recently added ENFAs (not published yet), someone could probably implement a detection method like RXXR fairly easily.
-
I am currently writing my Bachelor thesis on a new static analysis method for exponential and polynomial backtracking. Part of this will be a JS library implementing my new detection method. We could also use this when it's ready (probably in 3-4 months or so).
Related:
- nodesecurity/eslint-plugin-security#28
- sindresorhus/eslint-plugin-unicorn#153