prettier-java
prettier-java copied to clipboard
Upgrade to chevrotain 7
Why ?
We should try to have latest versions of our libs to include easily their improvements (bugs, enhancements, etc.)
What ?
I tried to upgrade, but ran into an error:
jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:184
throw e;
^
TypeError: Cannot read property 'location' of undefined
at JavaParser.TreeBuilder.cstPostNonTerminal (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/tree_builder.js:160:73)
at JavaParser.cstPostNonTerminal (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/packages/java-parser/src/parser.js:81:11)
at JavaParser.RecognizerEngine.subruleInternal (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:423:18)
at JavaParser.RecognizerApi.SUBRULE (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_api.js:72:21)
at JavaParser.<anonymous> (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/packages/java-parser/src/productions/interfaces.js:241:7)
at JavaParser.invokeRuleWithTry (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:117:33)
at JavaParser.wrappedGrammarRule [as annotation] (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:131:38)
at JavaParser.RecognizerEngine.doSingleRepetition (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:385:16)
at JavaParser.RecognizerEngine.manyInternalLogic (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:318:29)
at JavaParser.RecognizerEngine.manyInternal (/Users/clementdessoude/Documents/dev/jhipster/prettier-java/node_modules/chevrotain/lib/src/parse/parser/traits/recognizer_engine.js:295:21)
@bd82 do you think you could have a look ? I have to admit that I am a bit lost on the origin of the issue.
Can we add a bounty on this @pascalgrimaud ?
Starting with 200, which can be increased if it's a lot of work
Chevrotain version 9.x is out
There was a conditional inside the cstPostNonTerminal
method.
It seems to have been removed in newer versions
- https://github.com/Chevrotain/chevrotain/blob/v6.5.0/packages/chevrotain/src/parse/parser/traits/tree_builder.ts#L268-L272
Try putting the super
call inside the conditional check in the override of the JavaParser
- https://github.com/jhipster/prettier-java/blob/main/packages/java-parser/src/parser.js#L80-L81
Not sure a bounty is needed for this...
Thanks a lot @bd82, it looks indeed that your suggestion works !
Even if it was so much work, you deserve the bounty for all the great work you did on this lib, so if you want to take it, it's yours !!
I don't have a paypal account :)
It seems that upgrading to chevrotain >=7 increase parser execution by 3 (for a file over 6000 lines) :/
data:image/s3,"s3://crabby-images/58078/58078dd8a7168689d9f07dd3c86a27415825922f" alt="image"
If it reduces the execution time, it is a good news, isn't it? Don't understand your smiley @clementdessoude
If it reduces the execution time, it is a good news, isn't it? Don't understand your smiley @clementdessoude
Increase, sorry
Oh in this case, it is not a good news indeed
Is that for most test files or just for that single one?
Comparing the logs between these two pipeline, it seems a common behaviour:
For instance ConfigFileApplicationListener was parsed in 56ms in the first pipeline, and 140ms in the second one
It is approximately the same on my computer
There are not supposed to be any performance regressions, (definitively none so large). The main performance hog with Java Parser is the backtracking done to keep the grammar strongly aligned with the specifications. Perhaps somehow the way that "smart" backtracking has been implemented its not as efficient as with newer versions of Chevrotain?
- https://github.com/jhipster/prettier-java/blob/main/packages/java-parser/src/parser.js#L92-L112
Possible things that can go wrong:
- Maybe
this.outputCst = false;
does not disable as many things as it did in 6.x (what caused the execption) so more work is being done while backtracking? - Maybe
isRecognitionException()
behaves differently and the block does not return?
I'd think the first one is more likely
Is the performance drop happening with the update to 7.x or to 9.x?
Is the performance drop happening with the update to 7.x or to 9.x?
It's in the update to 7.x.
I will try to investigate it, but not sure when.
I've finally have some time to start looking into it.
I've had trouble with the local dev env because of windows + node-gyp + python.
What I'm doing now is:
-
running the benchmark on the folder below:
-
But without the latest released version, only against the master version. (because I could not get
niv
to work).- note this it sometimes seems to me that benchmarking two versions of the similar library will give more consistent results if each library is run in a different process,.
-
and on this branch: https://github.com/clementdessoude/prettier-java/tree/chore/upgrade-chevrotain
-
and changing the chevrotain version in the package.json and executing
yarn
between benchmarks.
I can detect a performance regression switching from 6.5 to 7.0, but not x3 slower, rather 25-30% reduction. I am not sure why is that as with a quick look I can't spot any commit hat could have cause it between 6.5 and 7.0
- https://github.com/chevrotain/chevrotain/compare/v6.5.0...v7.0.0
Perhaps it is some shenanigans inside of V8 engine's optimization...
Perhaps I can try npm linking
chevrotain and doing a "binary" search to detect the "offensive" commit.
I think I was mistaken, the regression I noticed was likely between 6.5.0 and 7.12 (latest 7.x) As I can not reproduce a performance regression between 6.5.0 -> 7.0.0
But I can now repeatedly reproduce a performance regression between 7.1.0 -> 7.1.1
- https://github.com/Chevrotain/chevrotain/compare/v7.1.0...v7.1.1
Okay this is the culprit: it was a refactor which made the exceptions use ES6 syntax and inheritance.
- https://github.com/Chevrotain/chevrotain/commit/9045686f17ae2238fc709496224f6e52b41e158d
What is happening is that due to the "naive" backtracking nature of the parser (no memorization)
- Backtracking because we wanted to stick as close to the spec as possible (at cost of performance).
A great many exceptions are thrown, when the parser tries and retries backtracking again and again. This seems to be quite extreme.
For samples under: `/packages/java-parser/samples/spring-framework/spring-expression/src/main/java/org/springframework/expression/spel/
Which are not that much code:
github.com/AlDanial/cloc v 1.90 T=0.20 s (579.8 files/s, 91927.9 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Java 118 2363 5101 11246
-------------------------------------------------------------------------------
SUM: 118 2363 5101 11246
-------------------------------------------------------------------------------
The RecognitionException
constructor gets called exactly: 263251 times.
Which is approx 23.4 per "real" line of code.
So when the exception classes became more complex:
- class inheritance
- additional logic in the constructor
It negatively impacted the performance.
- Note that I only saw 30-40% performance r, regressions, but it could depend on the specific set of files and the node engine used.
Possible Solutions
In Chevrotain revert the exceptions implementation to the old style
Problems:
- Won't be available quickly, as Chevrotain's master branch currently contains breaking changes as work on a new major release is being done.
- While I would like to simplify the exceptions (no real need for class inheritance), the scenario of throwing 24 exceptions per line is not exactly "common" so how much should it be optimized for on the library level?
- Some of the logic added to the Exception constructor may have actually been needed to solve some issues with stack traces.
In java-parser implement a simple hack
Basically use something like https://github.com/ds300/patch-package to apply a patch on the exceptions_public.js
file
in chevrotain. I am not sure if the java-parser
needs any information of the state of the exceptions other than the .message
property, so a very simple alternative implementation could be provided.
- Maybe even just replace with the same file.
- alternatively if you can prove you only need the message and no other state, perhaps an optimized version could be "injected" which would provide a speed boost.
This should be easy to perform and not too fragile as the exceptions file does not often change.
In java-parser attempt to implement memoization on the backtracking logic
This is basically trying to improve the parsing backtracking algorithm. ~~It is quite likely the parser could be several times or even an order of magnitude faster if it was smarter on how it backtracks.~~
- Edit: the above statement may be too "strong", lets instead say I suspect there is substantial room for improvements.
But this will require more research to prove and time investment to implement.
I will try to find time to open a separate issue with more details on this tomorrow
I suspect that using the LL(*) plugin for chevrotain could get rid of much of the backtracking used in the Java-Parser.
@msujew WDYT?
Example of backtracking production:
- https://github.com/jhipster/prettier-java/blob/2f7efa41db2423a200236027afb9f0fe8c198b02/packages/java-parser/src/productions/blocks-and-statements.js#L568-L587
Usage of the backtracking production:
- https://github.com/jhipster/prettier-java/blob/2f7efa41db2423a200236027afb9f0fe8c198b02/packages/java-parser/src/productions/blocks-and-statements.js#L27-L51
Backtracking wrapper logic
- https://github.com/jhipster/prettier-java/blob/2f7efa41db2423a200236027afb9f0fe8c198b02/packages/java-parser/src/parser.js#L92-L112
@bd82 I've used a modified version of the parser used in this project for my thesis. I could probably contribute a PR which takes care of this.
Edit: Tried it, but just removing all backtracking isn't possible. The grammar is ambiguous in a lot of places and a lot of tests are failing on me.
Thanks @msujew
Do you have any estimate or guess regarding what percentage of these lookaheads can be solved with LL* and what percentage cannot?
I am trying to understand if it may still be worth using the LL(*) plugin in this project.