antlr-kotlin
antlr-kotlin copied to clipboard
Performance Degradation
I just tried using a couple different version of this library to test things out. I started with version 0.0.4 and then tried one of the latest ones form jitpack cd716fb1a9. A test I had parsing a 13MB JSON file that used to run in about 300-400ms on average (running 10 times) jumped to over a second in the latest version. The grammar is just the JSON grammar in the antlr repository. I also ran just the same file with just the regular java antlr target and its performance lines up with what I get from version 0.0.4. The JSON file is just a long JSON file generated by JSON file generator. Degradation was visible in both JVM and JS and it is generally 2x slower. Running a profiler on it seems to show we are spending a lot more CPU time in org.antlr.v4.kotlinruntime.StringCharStreamKt.codePointIndices, 40% of all CPU time in fact.

Profiler run for version 0.4 doesn't show that

Thank you for reporting this! We should analyze and fix this
Root cause was mapNotNull creating an ArrayList under the hood to hold elements.
The ArrayList (obviously) keeps resizing, especially with big input strings.
So we had multiple primitive int array copies going on, plus the final copy with toIntArray.
This should now be solved by simply creating an int array of the max possible size (string.length), and then slicing it a single time at the end to cut it to the right size.
Let's see how it goes on JS with big inputs. I know the browser (or Node) will limit the size of arrays.
As an idea, we could swap it with a proper bit set.
@piacenti do you think this is still valid?
Last version contains another small performance boost.