
Compiling string matching algorithms and regular expressions to Java bytecode

Results 11 needle issues

Users might want to create subclasses of Pattern at build-time, not run-time. A solution could be more or less involved (these aren’t mutually exclusive): 1. A method to compile a...

Figure out how to handle very large regexes: the strategy we're using generates very large class files. a04107b097ff559542c4e9c2bcd937eed93145dd substantially increased the size of generated regexes, making the problem worse, and required...

The typical approach for determining the next state while looping through characters is to look it up from an array, after determining the byteClass for that character. We determine the...
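For readers unfamiliar with the "typical approach" this issue mentions, here is a minimal sketch of a table-driven DFA loop. It is not needle's generated code: the class name `TableDrivenDfa`, the example regex `[0-9]+`, and the table layout are all assumptions chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only: a hand-built DFA for the regex [0-9]+.
final class TableDrivenDfa {
    // Two byte classes: 0 = ASCII digit, 1 = everything else.
    private static final int CLASS_COUNT = 2;
    private static final byte[] BYTE_CLASSES = new byte[128];

    static {
        Arrays.fill(BYTE_CLASSES, (byte) 1);
        for (char c = '0'; c <= '9'; c++) {
            BYTE_CLASSES[c] = 0;
        }
    }

    // States: 0 = start, 1 = one or more digits seen (accepting), 2 = dead.
    // Row-major layout: TRANSITIONS[state * CLASS_COUNT + byteClass].
    private static final int[] TRANSITIONS = {
        1, 2, // from state 0
        1, 2, // from state 1
        2, 2  // from state 2
    };

    private static final boolean[] ACCEPTING = {false, true, false};

    static boolean matches(String input) {
        int state = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Characters outside the ASCII table fall into the "other" class.
            int byteClass = c < 128 ? BYTE_CLASSES[c] : 1;
            state = TRANSITIONS[state * CLASS_COUNT + byteClass];
        }
        return ACCEPTING[state];
    }

    public static void main(String[] args) {
        System.out.println(matches("12345"));
        System.out.println(matches("12a45"));
    }
}
```

Grouping characters into classes keeps the transition table small: its width is the number of classes rather than the size of the alphabet.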

When compiling a regex, there are several decisions we make that affect performance that depend on assumptions about the texts we'll be compiling against. When choosing whether to use a...

Matching against a `byte[]` is likely to have lower overhead than matching against a `String`, as the compiled code would no longer contain as much functionality inlined from `String#charAt`. If...
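As a rough sketch of what matching directly against a `byte[]` could look like, the loop below indexes a 256-entry class table with the unsigned byte value, so it needs neither `String#charAt` nor a range check. The class name `ByteArrayMatcher` and the example regex `a+b` are assumptions for illustration, not part of needle.

```java
import java.util.Arrays;

// Hypothetical illustration only: a hand-built DFA for the regex a+b over bytes.
final class ByteArrayMatcher {
    // Three byte classes: 0 = 'a', 1 = 'b', 2 = everything else.
    private static final int CLASS_COUNT = 3;
    private static final byte[] BYTE_CLASSES = new byte[256];

    static {
        Arrays.fill(BYTE_CLASSES, (byte) 2);
        BYTE_CLASSES['a'] = 0;
        BYTE_CLASSES['b'] = 1;
    }

    // States: 0 = start, 1 = one or more 'a' seen, 2 = accepting, 3 = dead.
    private static final int[] TRANSITIONS = {
        1, 3, 3, // from state 0
        1, 2, 3, // from state 1
        3, 3, 3, // from state 2
        3, 3, 3  // from state 3
    };

    static boolean matches(byte[] input) {
        int state = 0;
        for (byte b : input) {
            // Mask to an unsigned value so bytes >= 0x80 index the table safely.
            state = TRANSITIONS[state * CLASS_COUNT + BYTE_CLASSES[b & 0xFF]];
        }
        return state == 2;
    }

    public static void main(String[] args) {
        System.out.println(matches(new byte[] {'a', 'a', 'b'}));
        System.out.println(matches(new byte[] {'b'}));
    }
}
```

Whether this is actually faster than the `String` path is the open question the issue raises; the sketch only shows the shape of the loop.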

There are a number of unsupported meta-characters that the standard Java regexes support. While I don't know if I'd support all of them, we should make sure that, for sake...

I have written a [regex benchmark](https://github.com/almondtools/regexbench) comparing different regex engines for Java. Lately I found your approach and would be curious how it performs compared to the other alternatives: -...

The pattern interface supports three methods (matches, containedIn, find). When compiling to a Java class, each of these methods plays a role in the size of the class. Since the...

When searching for a literal in a string, we'd prefer to use String#indexOf, as it's implemented using Hotspot intrinsics, which ought to outperform anything we can write by hand. Sadly,...
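A minimal sketch of the preferred path described here, assuming the regex has already been recognized as a plain literal. The class and method names are hypothetical, and the sketch does not address whatever complication the truncated sentence goes on to describe.

```java
// Hypothetical illustration only: delegate literal search to String#indexOf.
final class LiteralSearch {
    // Returns the index of the first occurrence at or after fromIndex, or -1.
    static int findLiteral(String haystack, String literal, int fromIndex) {
        return haystack.indexOf(literal, fromIndex);
    }

    // True if the literal occurs anywhere in the haystack.
    static boolean containedIn(String haystack, String literal) {
        return haystack.indexOf(literal) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(findLiteral("hello needle", "needle", 0));
        System.out.println(containedIn("haystack", "needle"));
    }
}
```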