needle
needle copied to clipboard
Large regexes are poorly handled
Figure out how to handle very large regexes--the strategy we're using generates very large class files.
a04107b097ff559542c4e9c2bcd937eed93145dd substantially increased the size of generated regexes, making the problem worse, and required commenting out a previously used test case.
The problem isn't gone, but some recent work to enable use of byteclasses is probably helpful. The regex Holmes.{0,25}Watson|Watson.{0,25}Holmes now generates a 340 KB class, instead of 2MB.
Subsequent updates to encoding make Holmes.{0,25}Watson|Watson.{0,25}Holmes only 245181 bytes.