hermes icon indicating copy to clipboard operation
hermes copied to clipboard

Add Hermes' Canonicalization Optimization Pass

Open sebholl opened this issue 2 years ago • 1 comments

Summary

In response to the discussion with @tmikov around https://github.com/facebook/hermes/pull/1071, this PR introduces a brand new canonicalization step to the optimizer. The purpose of the canonicalization step is to standardize the syntax for code with the same semantics.

In the first implementation, we perform three basic canonicalization steps:

  1. We move the constant/literal operator evaluations from InstSimplify to InstCanonicalize. I felt this important in the canonicalization pass as it normalizes the various ways that minifiers represent literals into their unobfiscated form (e.g. void 0 becomes undefined; !0 becomes true and !1 becomes false).
  2. We flip constant/literal expressions to the right-hand side of binary expressions. This opens up more opportunity for CSE. Note, we could flip literals to the LHS but empirically this lead to worse optimization outcomes.
  3. We normalize all nulls in != and == operators to undefined (one of the optimization opportunities originating in https://github.com/facebook/hermes/pull/1071). Again this could have equally been flipped to map undefined to nulls but as documented in that PR, null lead to worse optimization outcomes.

As for the pipeline, I propose running the canonicalization step before every InstSimplify run as well as before the CSE step. This has yielded promising results so far. Introducing these extra passes has increased compilation time by about 2% but from the conversation with @tmikov, it appears this is acceptable.

Clean-Up

  • For BinaryOperatorInst::tryGetReverseOperator(...), we remove + from the list of reversible operators as I ran into the pitfall of + operating in "concatenation" mode if one of its runtime operands is a string which clearly isn't "reversible". This change is assumed safe as it had no existing callers in the Hermes repo. I don't know if this function is linked to from other libraries, but we can land the PR without this change if preferred.
  • As (hopefully) expected from canonicalization approach 2, we can now simplify InstSimplify::simplifyBinOp(...) pattern matching for (x | 0) and (x | "") cases to eliminate their symmetric checsk against (0 | x) and ("" | x).

Test Plan

We trial different variations of the implementations on a 33MB production app React Native JS bundle and compare it to the base case of "No Change"/"No Change" with the full -O optimization process:

Binary Operator Literals ==/!= Null/Undefined -dump-bytecode Line Count -dump-lir Line Count -emit-binary Size (bytes) Execution Time (s)†
To The Right Always Undefined 3,938,853 2,726,204 20,121,992 308.28
No Change Always Undefined 3,939,077 2,726,205 20,127,753 307.84
To The Left Always Undefined 3,939,233 2,726,251 20,128,289 306.27
To The Right No Change 3,943,062 2,729,947 20,133,420 306.21
No Change No Change 3,943,299 2,729,948 20,139,284 303.03
To The Left No Change 3,943,455 2,729,994 20,139,812 307.21

†Execution time is measured on an M2 MacBook Air with all processes running in parallel on separate cores.

In all observed cases, swapping literals to the right always resulted in better optimization.

_Tangent: Curiously, swapping literals to the left resulted in consistently worse performance across all metrics (actually worse than doing no normalization). It seems to be not applying the AddString simplification optimizations.

Finally, we introduce a new canonicalize.js test file and run...

% cmake --build ../build --target check-hermes all -j 4 
[25/26] Running the Hermes regression tests
Testing Time: 20.31s
  Expected Passes    : 1682
  Unsupported Tests  : 62

sebholl avatar Jul 30 '23 20:07 sebholl

@tmikov - any comment on these PRs?

shollington-rbi avatar Aug 09 '23 01:08 shollington-rbi