re2j
StackOverflowError for Machine.add
Hey, I was combining many (more than 10,000) similar file names into a single regexp (each name escaped, then joined with |).
In Java it compiles (though extremely slowly), but in RE2/J it fails with a stack overflow:
Caused by: java.lang.StackOverflowError
at com.google.re2j.Machine.add(Machine.java:358)
at com.google.re2j.Machine.add(Machine.java:358)
etc ...
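For anyone following along, here is a minimal sketch of how such a combined pattern might be built. It is not the original poster's code: the class name `CombinedFileNamePattern`, the helper `combine`, and the sample file names are all made up for illustration. It uses `java.util.regex.Pattern` so that it runs without extra dependencies.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not taken from the report above.
final class CombinedFileNamePattern {

    // Quote each name so that metacharacters such as '.' or '(' match literally,
    // then join the quoted names with '|' inside one non-capturing group.
    static String combine(List<String> fileNames) {
        return fileNames.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|", "(?:", ")"));
    }

    public static void main(String[] args) {
        // Sample names are invented; the real input had more than 10,000 entries.
        List<String> names = List.of("report-1.txt", "report-2.txt", "notes(1).md");
        Pattern p = Pattern.compile(combine(names));
        System.out.println(p.matcher("report-2.txt").matches()); // true
        System.out.println(p.matcher("report-2Xtxt").matches()); // false: '.' is literal
    }
}
```

With three names this is harmless. The report above concerns the same shape of pattern with more than 10,000 alternatives.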
Are you able to share a code snippet demonstrating the problem?
Also, maybe for your use case it's better to add all the strings to a HashSet and use contains.
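A minimal sketch of the HashSet suggestion, assuming the goal is to test whether a whole input string equals one of the known file names. It would not cover searching for a name inside a longer string. The class name `FileNameLookup` and the sample names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper illustrating the suggestion; not part of RE2/J.
final class FileNameLookup {
    private final Set<String> names;

    FileNameLookup(List<String> fileNames) {
        // Copy into a HashSet so that lookups take expected constant time.
        this.names = new HashSet<>(fileNames);
    }

    // Plays the role of a full match against an alternation of escaped literals.
    boolean contains(String candidate) {
        return names.contains(candidate);
    }

    public static void main(String[] args) {
        FileNameLookup lookup =
                new FileNameLookup(List.of("report-1.txt", "report-2.txt", "notes(1).md"));
        System.out.println(lookup.contains("report-2.txt")); // true
        System.out.println(lookup.contains("report-3.txt")); // false
    }
}
```

No regexp is compiled here, so the number of names does not affect recursion depth.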
We hit this as well. This Scala snippet reproduces the problem:
    import com.google.re2j.Pattern

    object TestRE2JStackOverflow {
      // Number of alternatives to join into a single pattern.
      private val limit = 10000
      // Range is end-exclusive, so this covers 'a' through 'y'.
      private val charRange = Range('a', 'z')

      def main(args: Array[String]): Unit = {
        // Build one ".*xxxx$" pattern per four-letter combination.
        val patterns =
          for (charOne <- charRange;
               charTwo <- charRange;
               charThree <- charRange;
               charFour <- charRange) yield {
            val codepoints = Array[Int](charOne, charTwo, charThree, charFour)
            val newString = new String(codepoints, 0, 4)
            ".*" + newString + "$"
          }
        // Join the first `limit` patterns into one non-capturing alternation.
        val combinedPatternString = patterns.slice(0, limit).mkString("(?:", "|", ")")
        val p: Pattern = Pattern.compile(combinedPatternString)
        // Running the match triggers the StackOverflowError.
        p.matches(" one two three four test ")
      }
    }
And the stack overflow:
at com.google.re2j.Machine.add(Machine.java:380)
at com.google.re2j.Machine.add(Machine.java:380)
at com.google.re2j.Machine.add(Machine.java:380)
at com.google.re2j.Machine.add(Machine.java:380)
at com.google.re2j.Machine.add(Machine.java:380)
Which originates from:
at com.google.re2j.Machine.add(Machine.java:380)
at com.google.re2j.Machine.match(Machine.java:253)
at com.google.re2j.RE2.doExecute(RE2.java:246)
at com.google.re2j.RE2.match(RE2.java:283)
at com.google.re2j.Matcher.genMatch(Matcher.java:327)
at com.google.re2j.Matcher.find(Matcher.java:304)
This is with RE2/J 1.4.