boa
boa copied to clipboard
Optimization fast local variables
Depends on #3185
This PR is a WIP, it implements an optimization on function local variables, functions that do not call eval
directly or variables that are not escaped, we can keep then on the stack after frame pointer and use and offset to index them (index = fp + offset
). This allows us to eliminate the creation of some of the gc collected environments. Which gives us huge performance improvements!
Benchmarks
Main
Uncaught Error: Benchmark had 1 errors
PROGRESS Richards
RESULT Richards 36.0
PROGRESS DeltaBlue
RESULT DeltaBlue 42.2
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 63.2
PROGRESS RayTrace
RESULT RayTrace 147
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 136
ERROR RegExp TypeError: not a constructor
undefined
PROGRESS RegExp
PROGRESS Splay
RESULT Splay 118
PROGRESS NavierStokes
RESULT NavierStokes 136
SCORE 84.6
Uncaught Error: Benchmark had 1 errors
PR
PROGRESS Richards
RESULT Richards 42.3
PROGRESS DeltaBlue
RESULT DeltaBlue 46.5
PROGRESS Encrypt
PROGRESS Decrypt
RESULT Crypto 102
PROGRESS RayTrace
RESULT RayTrace 164
PROGRESS Earley
PROGRESS Boyer
RESULT EarleyBoyer 158
ERROR RegExp TypeError: not a constructor
undefined
PROGRESS RegExp
PROGRESS Splay
RESULT Splay 124
PROGRESS NavierStokes
RESULT NavierStokes 246
SCORE 107
Almost 2x improvment on NavierStokes :)
Test262 conformance changes
Test result | main count | PR count | difference |
---|---|---|---|
Total | 95,574 | 95,574 | 0 |
Passed | 75,273 | 75,273 | 0 |
Ignored | 19,482 | 19,482 | 0 |
Failed | 819 | 819 | 0 |
Panics | 0 | 0 | 0 |
Conformance | 78.76% | 78.76% | 0.00% |
Codecov Report
Attention: 171 lines
in your changes are missing coverage. Please review.
Comparison is base (
a51581b
) 49.53% compared to head (d53a0c4
) 49.60%. Report is 5 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3194 +/- ##
==========================================
+ Coverage 49.53% 49.60% +0.06%
==========================================
Files 446 446
Lines 43722 44070 +348
==========================================
+ Hits 21659 21861 +202
- Misses 22063 22209 +146
Files | Coverage Δ | |
---|---|---|
boa_engine/src/bytecompiler/expression/assign.rs | 42.13% <100.00%> (+3.88%) |
:arrow_up: |
...gine/src/bytecompiler/expression/object_literal.rs | 54.16% <100.00%> (ø) |
|
boa_engine/src/bytecompiler/expression/unary.rs | 100.00% <100.00%> (ø) |
|
boa_engine/src/bytecompiler/function.rs | 97.40% <100.00%> (-2.60%) |
:arrow_down: |
boa_engine/src/bytecompiler/jump_control.rs | 94.47% <100.00%> (+0.05%) |
:arrow_up: |
boa_engine/src/bytecompiler/statement/block.rs | 100.00% <100.00%> (ø) |
|
boa_engine/src/bytecompiler/statement/loop.rs | 77.23% <100.00%> (+2.13%) |
:arrow_up: |
boa_engine/src/bytecompiler/statement/mod.rs | 91.17% <100.00%> (ø) |
|
boa_engine/src/bytecompiler/statement/switch.rs | 100.00% <100.00%> (ø) |
|
boa_engine/src/bytecompiler/statement/try.rs | 91.11% <100.00%> (+0.20%) |
:arrow_up: |
... and 17 more |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Blocked on #3490
@HalidOdat I would like to take a look at continuing development on local variables. We talked about having environment aware bytecode generation for this with analyzed variable scopes. Do you have any more work or ideas on this or should I just start from the current state of this PR? Also, should I keep anything in mind regarding #3798?
Do you have any more work or ideas on this or should I just start from the current state of this PR?
The issue I have with the current implementation is that it does not account for many cases (and it's very hacked together).
My idea for a more proper solution would be to move the compile time environment logic to the parser and have the AST nodes like *Function*
, Block
contain a compile time environment, LibJS does this. It removes the need for a second pass after parsing (though makes parsing slower if we want to fail fast, but most scripts are valid so I don't think this is an issue).
The second idea was to perform a second pass after the AST generation this would generate a data structure that contains all the scope data that is passed to the compiler, and add a ID to the AST nodes that is None
initially and filled in with the scope it points to, making mapping back to the scopes more efficient. This is a simpler approach and I was experimenting with it.
I began working on this but didn't make much progress, you can find the code on my fork (hope it helps): https://github.com/HalidOdat/boa/tree/scope-analysis
Also, should I keep anything in mind regarding #3798?
Hmmm.. The logic on how to allocate register/local variables in the compiler will be shared, besides that I think both tasks can be worked on simultaneously.
I'll create a PR and extract that logic so we can merge it into main
.