
StackOverflowError for Machine.add

Open l0rinc opened this issue 7 years ago • 2 comments

Hey, I was combining many (>10,000) similar file names into a single regexp (each one escaped, then joined with |). With Java's built-in regex engine it compiles (though extremely slowly), but with RE2/J it fails with a stack overflow:

Caused by: java.lang.StackOverflowError
	at com.google.re2j.Machine.add(Machine.java:358)
	at com.google.re2j.Machine.add(Machine.java:358)
etc ...

l0rinc, Oct 06 '18 12:10

Are you able to share a code snippet demonstrating the problem?

Also, maybe for your use case it's better to add all the strings to a HashSet and use contains.
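
As a rough illustration of the HashSet suggestion, here is a minimal sketch. It assumes the goal is exact matching against a fixed list of file names, which may not be the original use case; the class name, helper name, and sample file names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FileNameLookup {
    // Hypothetical helper: build the lookup set once from the known file names.
    // No escaping is needed because the names are compared literally.
    static Set<String> buildLookup(List<String> fileNames) {
        return new HashSet<>(fileNames);
    }

    public static void main(String[] args) {
        // Sample names are made up; characters such as '|' or '(' need no special handling.
        Set<String> known = buildLookup(Arrays.asList("report.txt", "data (1).csv", "a|b.log"));

        System.out.println(known.contains("a|b.log")); // prints "true"
        System.out.println(known.contains("report"));  // prints "false" (exact match only)
    }
}
```

This only covers exact matches. If the names have to be found inside longer strings, a set lookup alone is not a drop-in replacement for the combined regexp.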

sjamesr, Oct 06 '18 14:10

We hit this as well. This Scala snippet reproduces the problem:

import com.google.re2j.Pattern

object TestRE2JStackOverflow {

  private val limit = 10000
  private val charRange = Range('a', 'z')

  def main(args: Array[String]): Unit = {
    val patterns =
      for (charOne <- charRange;
        charTwo <- charRange;
        charThree <- charRange;
        charFour <- charRange) yield {

        val codepoints = Array[Int](charOne, charTwo, charThree, charFour)
        val newString = new String(codepoints, 0, 4)

        val stringToUse = ".*" + newString + "$"

        stringToUse
      }

    val combinedPatternString = patterns.slice(0, limit).mkString("(?:", "|", ")")

    val p: Pattern = Pattern.compile(combinedPatternString)

    p.matches(" one two three four test ")
  }
}

And the stack overflow:

	at com.google.re2j.Machine.add(Machine.java:380)
	at com.google.re2j.Machine.add(Machine.java:380)
	at com.google.re2j.Machine.add(Machine.java:380)
	at com.google.re2j.Machine.add(Machine.java:380)
	at com.google.re2j.Machine.add(Machine.java:380)

Which originates from:

	at com.google.re2j.Machine.add(Machine.java:380)
	at com.google.re2j.Machine.match(Machine.java:253)
	at com.google.re2j.RE2.doExecute(RE2.java:246)
	at com.google.re2j.RE2.match(RE2.java:283)
	at com.google.re2j.Matcher.genMatch(Matcher.java:327)
	at com.google.re2j.Matcher.find(Matcher.java:304)

This is with RE2/J 1.4.

michaelbraun, Mar 16 '21 18:03