LocalizeChildren pass
This pass finds Binary instructions whose children have effects and moves those children to locals. This is meant to help with the situation in #7557 (see details there, but basically, the effects of children can prevent OptimizeInstructions from optimizing, and moving those effects outside can help).
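As a rough illustration of the idea, here is a minimal Python sketch on a toy expression tree. It is not Binaryen's implementation: the node classes, `has_effects`, and `localize_children` are hypothetical names invented for this example, and the effect analysis is deliberately simplistic (only calls count as effectful). To stay conservative about evaluation order, the sketch also moves effect-free, non-constant siblings to locals whenever any child has effects.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Const:
    value: int


@dataclass
class LocalGet:
    index: int


@dataclass
class LocalSet:
    index: int
    value: "Expr"


@dataclass
class Call:
    # Stands in for any child with side effects (toy assumption).
    target: str


@dataclass
class Binary:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, LocalGet, LocalSet, Call, Binary]


def has_effects(expr):
    """Toy effect analysis: calls and local.sets are effectful."""
    if isinstance(expr, (Call, LocalSet)):
        return True
    if isinstance(expr, Binary):
        return has_effects(expr.left) or has_effects(expr.right)
    return False


def localize_children(binary, next_local):
    """Hoist the children of `binary` into fresh locals if any has effects.

    Returns (statements, next_local). The statements are the hoisted
    local.sets, in the original evaluation order, followed by the Binary,
    which now reads the hoisted values through local.gets.
    """
    if not (has_effects(binary.left) or has_effects(binary.right)):
        return [binary], next_local  # nothing to do

    sets = []
    children = []
    for child in (binary.left, binary.right):
        if isinstance(child, Const):
            children.append(child)  # constants are safe to leave in place
        else:
            sets.append(LocalSet(next_local, child))
            children.append(LocalGet(next_local))
            next_local += 1
    return sets + [Binary(binary.op, children[0], children[1])], next_local


# Usage: (add (call $foo) (const 1)) becomes a local.set plus a simple add.
stmts, next_free = localize_children(Binary("add", Call("foo"), Const(1)), 0)
for stmt in stmts:
    print(stmt)
```

After the hoist, the Binary has only simple children, which is the shape a later peephole optimization can work with more easily.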
This seems generally helpful, though it usually reduces code size by less than 1%, and it does make compilation 5% slower (not because of the pass itself, which is very fast, but likely because the new locals it adds slow down later passes). In more detail, I tested and saw a small improvement on Kotlin, Dart, and Rust testcases, and on Emscripten's code size tests I see this:
Emscripten code size diff
diff --git a/test/code_size/audio_worklet_wasm.json b/test/code_size/audio_worklet_wasm.json
index 5aad516bd2..feeaeefafe 100644
--- a/test/code_size/audio_worklet_wasm.json
+++ b/test/code_size/audio_worklet_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 519,
"a.html.gz": 364,
"a.js": 3853,
"a.js.gz": 2050,
"a.wasm": 1294,
- "a.wasm.gz": 864,
+ "a.wasm.gz": 866,
"total": 5666,
- "total_gz": 3278
+ "total_gz": 3280
}
diff --git a/test/code_size/embind_hello_wasm.json b/test/code_size/embind_hello_wasm.json
index a9884cc49d..c2c96c190a 100644
--- a/test/code_size/embind_hello_wasm.json
+++ b/test/code_size/embind_hello_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 552,
"a.html.gz": 380,
"a.js": 7266,
"a.js.gz": 3321,
"a.wasm": 7300,
- "a.wasm.gz": 3348,
+ "a.wasm.gz": 3349,
"total": 15118,
- "total_gz": 7049
+ "total_gz": 7050
}
diff --git a/test/code_size/embind_val_wasm.json b/test/code_size/embind_val_wasm.json
index 542c19cf1f..34c9da070e 100644
--- a/test/code_size/embind_val_wasm.json
+++ b/test/code_size/embind_val_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 552,
"a.html.gz": 380,
"a.js": 5367,
"a.js.gz": 2540,
"a.wasm": 9101,
- "a.wasm.gz": 4699,
+ "a.wasm.gz": 4705,
"total": 15020,
- "total_gz": 7619
+ "total_gz": 7625
}
diff --git a/test/code_size/hello_wasm_worker_wasm.json b/test/code_size/hello_wasm_worker_wasm.json
index 4b31168c56..37fd223c48 100644
--- a/test/code_size/hello_wasm_worker_wasm.json
+++ b/test/code_size/hello_wasm_worker_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 519,
"a.html.gz": 364,
"a.js": 830,
"a.js.gz": 530,
"a.wasm": 1891,
- "a.wasm.gz": 1082,
+ "a.wasm.gz": 1083,
"total": 3240,
- "total_gz": 1976
+ "total_gz": 1977
}
diff --git a/test/code_size/hello_webgl2_wasm.json b/test/code_size/hello_webgl2_wasm.json
index c9afcedc35..b496ff409e 100644
--- a/test/code_size/hello_webgl2_wasm.json
+++ b/test/code_size/hello_webgl2_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 454,
"a.html.gz": 328,
"a.js": 4386,
"a.js.gz": 2252,
- "a.wasm": 8286,
- "a.wasm.gz": 5617,
- "total": 13126,
- "total_gz": 8197
+ "a.wasm": 8292,
+ "a.wasm.gz": 5618,
+ "total": 13132,
+ "total_gz": 8198
}
diff --git a/test/code_size/hello_webgl2_wasm2js.json b/test/code_size/hello_webgl2_wasm2js.json
index 89e28d08c8..3929ae6843 100644
--- a/test/code_size/hello_webgl2_wasm2js.json
+++ b/test/code_size/hello_webgl2_wasm2js.json
@@ -1,8 +1,8 @@
{
"a.html": 346,
"a.html.gz": 262,
"a.js": 18078,
- "a.js.gz": 9781,
+ "a.js.gz": 9784,
"total": 18424,
- "total_gz": 10043
+ "total_gz": 10046
}
diff --git a/test/code_size/hello_webgl_wasm.json b/test/code_size/hello_webgl_wasm.json
index 2e1ba8e7f8..86cce56c94 100644
--- a/test/code_size/hello_webgl_wasm.json
+++ b/test/code_size/hello_webgl_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 454,
"a.html.gz": 328,
"a.js": 3924,
"a.js.gz": 2092,
- "a.wasm": 8286,
- "a.wasm.gz": 5617,
- "total": 12664,
- "total_gz": 8037
+ "a.wasm": 8292,
+ "a.wasm.gz": 5618,
+ "total": 12670,
+ "total_gz": 8038
}
diff --git a/test/code_size/hello_webgl_wasm2js.json b/test/code_size/hello_webgl_wasm2js.json
index 3a2fcd28a1..c9d40e9826 100644
--- a/test/code_size/hello_webgl_wasm2js.json
+++ b/test/code_size/hello_webgl_wasm2js.json
@@ -1,8 +1,8 @@
{
"a.html": 346,
"a.html.gz": 262,
"a.js": 17605,
- "a.js.gz": 9614,
+ "a.js.gz": 9622,
"total": 17951,
- "total_gz": 9876
+ "total_gz": 9884
}
diff --git a/test/code_size/math_wasm.json b/test/code_size/math_wasm.json
index 9d06a35db4..568a479d2e 100644
--- a/test/code_size/math_wasm.json
+++ b/test/code_size/math_wasm.json
@@ -1,10 +1,10 @@
{
"a.html": 552,
"a.html.gz": 380,
"a.js": 110,
"a.js.gz": 125,
- "a.wasm": 2687,
- "a.wasm.gz": 1658,
- "total": 3349,
- "total_gz": 2163
+ "a.wasm": 2693,
+ "a.wasm.gz": 1662,
+ "total": 3355,
+ "total_gz": 2167
}
diff --git a/test/code_size/random_printf_wasm.json b/test/code_size/random_printf_wasm.json
index fa6667bef3..66b36767c9 100644
--- a/test/code_size/random_printf_wasm.json
+++ b/test/code_size/random_printf_wasm.json
@@ -1,6 +1,6 @@
{
"a.html": 12515,
- "a.html.gz": 6857,
+ "a.html.gz": 6858,
"total": 12515,
- "total_gz": 6857
+ "total_gz": 6858
}
diff --git a/test/code_size/random_printf_wasm2js.json b/test/code_size/random_printf_wasm2js.json
index 87d6dfdb9a..29024d0de1 100644
--- a/test/code_size/random_printf_wasm2js.json
+++ b/test/code_size/random_printf_wasm2js.json
@@ -1,6 +1,6 @@
{
- "a.html": 17224,
- "a.html.gz": 7558,
- "total": 17224,
- "total_gz": 7558
+ "a.html": 17228,
+ "a.html.gz": 7560,
+ "total": 17228,
+ "total_gz": 7560
}
diff --git a/test/other/codesize/test_codesize_cxx_ctors1.size b/test/other/codesize/test_codesize_cxx_ctors1.size
index 4cd9784974..9f7ea6c403 100644
--- a/test/other/codesize/test_codesize_cxx_ctors1.size
+++ b/test/other/codesize/test_codesize_cxx_ctors1.size
@@ -1 +1 @@
-129523
+129499
diff --git a/test/other/codesize/test_codesize_cxx_ctors2.size b/test/other/codesize/test_codesize_cxx_ctors2.size
index 825f9c99dd..82fb741f2b 100644
--- a/test/other/codesize/test_codesize_cxx_ctors2.size
+++ b/test/other/codesize/test_codesize_cxx_ctors2.size
@@ -1 +1 @@
-128951
+128927
diff --git a/test/other/codesize/test_codesize_cxx_except.size b/test/other/codesize/test_codesize_cxx_except.size
index 9c576d550e..c0b75c2973 100644
--- a/test/other/codesize/test_codesize_cxx_except.size
+++ b/test/other/codesize/test_codesize_cxx_except.size
@@ -1 +1 @@
-171291
+171264
diff --git a/test/other/codesize/test_codesize_cxx_except_wasm.size b/test/other/codesize/test_codesize_cxx_except_wasm.size
index 73186ab753..1db904cf3c 100644
--- a/test/other/codesize/test_codesize_cxx_except_wasm.size
+++ b/test/other/codesize/test_codesize_cxx_except_wasm.size
@@ -1 +1 @@
-144653
+144594
diff --git a/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size b/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
index 17dc029298..c53f6bcc31 100644
--- a/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
+++ b/test/other/codesize/test_codesize_cxx_except_wasm_legacy.size
@@ -1 +1 @@
-142242
+142183
diff --git a/test/other/codesize/test_codesize_cxx_lto.size b/test/other/codesize/test_codesize_cxx_lto.size
index 50420f0c57..0a9a5f1e95 100644
--- a/test/other/codesize/test_codesize_cxx_lto.size
+++ b/test/other/codesize/test_codesize_cxx_lto.size
@@ -1 +1 @@
-121790
+121789
diff --git a/test/other/codesize/test_codesize_cxx_mangle.size b/test/other/codesize/test_codesize_cxx_mangle.size
index 06a97f0e9f..f3ed20a321 100644
--- a/test/other/codesize/test_codesize_cxx_mangle.size
+++ b/test/other/codesize/test_codesize_cxx_mangle.size
@@ -1 +1 @@
-235338
+235311
diff --git a/test/other/codesize/test_codesize_cxx_noexcept.size b/test/other/codesize/test_codesize_cxx_noexcept.size
index d502821bac..01e6459671 100644
--- a/test/other/codesize/test_codesize_cxx_noexcept.size
+++ b/test/other/codesize/test_codesize_cxx_noexcept.size
@@ -1 +1 @@
-131941
+131917
diff --git a/test/other/codesize/test_codesize_cxx_wasmfs.size b/test/other/codesize/test_codesize_cxx_wasmfs.size
index 24149d6189..0709f692df 100644
--- a/test/other/codesize/test_codesize_cxx_wasmfs.size
+++ b/test/other/codesize/test_codesize_cxx_wasmfs.size
@@ -1 +1 @@
-169798
+169789
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 3c46971663..98177161f2 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50330
+50354
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b339887848..2cbd341de0 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1733
+1679
diff --git a/test/other/codesize/test_codesize_hello_Os.size b/test/other/codesize/test_codesize_hello_Os.size
index a858d2d47b..ad0b314c27 100644
--- a/test/other/codesize/test_codesize_hello_Os.size
+++ b/test/other/codesize/test_codesize_hello_Os.size
@@ -1 +1 @@
-1724
+1722
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..b0536459d7 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1205
diff --git a/test/other/codesize/test_codesize_hello_dylink.size b/test/other/codesize/test_codesize_hello_dylink.size
index afd43f8827..c36f42d794 100644
--- a/test/other/codesize/test_codesize_hello_dylink.size
+++ b/test/other/codesize/test_codesize_hello_dylink.size
@@ -1 +1 @@
-18547
+18521
diff --git a/test/other/codesize/test_codesize_hello_single_file.gzsize b/test/other/codesize/test_codesize_hello_single_file.gzsize
index 64d519daab..13f3698f99 100644
--- a/test/other/codesize/test_codesize_hello_single_file.gzsize
+++ b/test/other/codesize/test_codesize_hello_single_file.gzsize
@@ -1 +1 @@
-3620
+3587
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 8755c7be20..b0b20f1984 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6611
+6539
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b339887848..2cbd341de0 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1733
+1679
diff --git a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
index 7193414dbf..cb15afe743 100644
--- a/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
+++ b/test/other/codesize/test_codesize_mem_O3_standalone_narg_flto.size
@@ -1 +1 @@
-4097
+4101
diff --git a/test/other/codesize/test_codesize_minimal_pthreads.size b/test/other/codesize/test_codesize_minimal_pthreads.size
index 45f705e322..f637d9f6da 100644
--- a/test/other/codesize/test_codesize_minimal_pthreads.size
+++ b/test/other/codesize/test_codesize_minimal_pthreads.size
@@ -1 +1 @@
-19417
+19422
diff --git a/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size b/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
index 11826fc9de..7baaed78af 100644
--- a/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
+++ b/test/other/codesize/test_codesize_minimal_pthreads_memgrowth.size
@@ -1 +1 @@
-19418
+19423
The results there are mixed, but e.g. hello world -O3 is 3% smaller.
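For reference, the relative changes can be recomputed from the before/after sizes in the diff above. This is only a small arithmetic sketch; `percent_change` is a helper name made up for this example, and the three entries are a hand-picked subset of the diff.

```python
def percent_change(before, after):
    """Relative size change in percent (negative means smaller)."""
    return (after - before) / before * 100.0


# (before, after) sizes in bytes, copied from the diff above.
sizes = {
    "test_codesize_hello_O3": (1733, 1679),
    "test_codesize_hello_Oz": (1259, 1205),
    "test_codesize_files_wasmfs": (50330, 50354),
}

for name, (before, after) in sizes.items():
    print(f"{name}: {percent_change(before, after):+.2f}%")
```

The hello world -O3 entry works out to a reduction of a little over 3%, while the files_wasmfs regression is well under 0.1%.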
This may make sense to land, but:
- The cases where the results are worse should be investigated.
- The compilation slowdown should be mitigated - perhaps we can pick more carefully when to use locals.
The last commits make this localize only when we see a falling-through constant, which we take as a sign that it is worth adding locals in the hopes of later optimizations finding things. That is enough for the motivating use cases. This makes the pass take almost 0 time, so speed is no longer an issue. It reduces the changes to real-world code, leaving this in Emscripten:
diff --git a/test/other/codesize/test_codesize_files_wasmfs.size b/test/other/codesize/test_codesize_files_wasmfs.size
index 3c46971663..002a2213dd 100644
--- a/test/other/codesize/test_codesize_files_wasmfs.size
+++ b/test/other/codesize/test_codesize_files_wasmfs.size
@@ -1 +1 @@
-50330
+50336
diff --git a/test/other/codesize/test_codesize_hello_O3.size b/test/other/codesize/test_codesize_hello_O3.size
index b339887848..357a340dae 100644
--- a/test/other/codesize/test_codesize_hello_O3.size
+++ b/test/other/codesize/test_codesize_hello_O3.size
@@ -1 +1 @@
-1733
+1681
diff --git a/test/other/codesize/test_codesize_hello_Oz.size b/test/other/codesize/test_codesize_hello_Oz.size
index 771034cb6a..de0cde04c8 100644
--- a/test/other/codesize/test_codesize_hello_Oz.size
+++ b/test/other/codesize/test_codesize_hello_Oz.size
@@ -1 +1 @@
-1259
+1207
diff --git a/test/other/codesize/test_codesize_hello_single_file.gzsize b/test/other/codesize/test_codesize_hello_single_file.gzsize
index 64d519daab..468cbfccfe 100644
--- a/test/other/codesize/test_codesize_hello_single_file.gzsize
+++ b/test/other/codesize/test_codesize_hello_single_file.gzsize
@@ -1 +1 @@
-3620
+3589
diff --git a/test/other/codesize/test_codesize_hello_single_file.jssize b/test/other/codesize/test_codesize_hello_single_file.jssize
index 8755c7be20..aff74eed6b 100644
--- a/test/other/codesize/test_codesize_hello_single_file.jssize
+++ b/test/other/codesize/test_codesize_hello_single_file.jssize
@@ -1 +1 @@
-6611
+6543
diff --git a/test/other/codesize/test_codesize_hello_wasmfs.size b/test/other/codesize/test_codesize_hello_wasmfs.size
index b339887848..357a340dae 100644
--- a/test/other/codesize/test_codesize_hello_wasmfs.size
+++ b/test/other/codesize/test_codesize_hello_wasmfs.size
@@ -1 +1 @@
-1733
+1681
Still not 100% positive, but mostly so, and it keeps that 3% win on hello world -O3.
I investigated that small regression, and it is basically noise. Looking at more real-world things, on most there are a few bytes of noise one way or the other, but it does help a bit sometimes (50 bytes on LZMA, which is 0.1%), and sometimes by more than a bit (6% better on Poppler).
Overall this looks like it might be worth landing. We can consider expanding what it does later (more than Binary, and more things than falling-through constants).
Looks good. However, the evaluation focuses on compile-time metrics like code size and compilation time. Since this pass introduces more local get/set operations, which could affect runtime performance, have we considered measuring runtime impact to check for any regressions?
The locals that it adds should get removed by later passes (unless they end up important). In theory that could be missed by the data I reported above (size could be smaller while local operations increase, if something else decreases enough), but it's unlikely. Here is the diff for hello world:
12c12
< [total] : 797
---
> [total] : 770
14,16c14,16
< Binary : 94
< Block : 45
< Break : 50
---
> Binary : 91
> Block : 41
> Break : 45
19,20c19,20
< Const : 137
< Drop : 8
---
> Const : 132
> Drop : 7
24,27c24,27
< Load : 75
< LocalGet : 198
< LocalSet : 67
< Loop : 11
---
> Load : 73
> LocalGet : 196
> LocalSet : 66
> Loop : 10
34c34
< Unary : 11
---
> Unary : 8
Everything decreases there.
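That claim can be checked mechanically from the numbers in the diff. The sketch below simply copies the before/after instruction counts into two dictionaries (the variable names are made up for this example) and verifies that every listed count goes down, including the local operations.

```python
# Instruction counts copied from the metrics diff above.
before = {
    "[total]": 797, "Binary": 94, "Block": 45, "Break": 50, "Const": 137,
    "Drop": 8, "Load": 75, "LocalGet": 198, "LocalSet": 67, "Loop": 11,
    "Unary": 11,
}
after = {
    "[total]": 770, "Binary": 91, "Block": 41, "Break": 45, "Const": 132,
    "Drop": 7, "Load": 73, "LocalGet": 196, "LocalSet": 66, "Loop": 10,
    "Unary": 8,
}

deltas = {name: after[name] - before[name] for name in before}
all_decrease = all(delta < 0 for delta in deltas.values())

# The per-instruction deltas should add up to the change in the total.
per_kind_sum = sum(d for name, d in deltas.items() if name != "[total]")

for name, delta in deltas.items():
    print(f"{name:10s} {delta:+d}")
print("all decrease:", all_decrease)
print("per-kind sum matches total:", per_kind_sum == deltas["[total]"])
```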
Note also that adding more local operations usually does not affect runtime performance. VMs lower local operations into SSA form, which optimizes away unneeded operations. (Though interpreters and baseline tiers do less, and might be slower.)