substrate icon indicating copy to clipboard operation
substrate copied to clipboard

WIP: Replace `wasm-gc` with `wasm-opt`

Open athei opened this issue 1 year ago • 27 comments

This replaces the outdated wasm-gc crate by the new wasm-opt crate for post processing of the runtime blob. The latter is a safe binding for the tried and true binaryen. For now we run with optimization level -O0 to just do simple garbage collection until we are confident enough in the applied optimizations.

Let's run some benchmarks.

Possible due to https://github.com/w3f/Grants-Program/blob/master/applications/wasm-opt-for-rust.md

cc @Robbepop closes #1262

athei avatar Sep 15 '22 15:09 athei

This will close https://github.com/paritytech/substrate/issues/1262 if merged

nazar-pc avatar Sep 17 '22 00:09 nazar-pc

Rebased and removed changes to weight files in order to avoid merge conflicts.

athei avatar Sep 21 '22 20:09 athei

Some more numbers (release and production are for the runtime from substrate with benchmarking enabled; polkadot is for the polkadot runtime compiled under production; sorry, the table is kind of a chungus):

kind functions size size_compressed time has name has DWARF
polkadot 3005 5330k 1264k 0s true true
polkadot wasm-gc 3005 4884k 1203k 0s true false
polkadot wasm-gc --no-demangle 3005 4945k 1210k 0s true false
polkadot wasm-opt -O0 3005 4716k 1170k 0s false true
polkadot wasm-opt -O1 2936 4520k 1147k 1s false true
polkadot wasm-opt -O2 2597 4504k 1156k 6s false true
polkadot wasm-opt -O3 2592 4476k 1163k 9s false true
polkadot wasm-opt -O4 2593 4500k 1168k 17s false true
polkadot wasm-opt -Os 2594 4471k 1162k 7s false true
polkadot wasm-opt -Oz 2594 4472k 1161k 9s false true
polkadot wasm-opt -O0 --strip-dwarf 3005 4570k 1132k 0s false false
polkadot wasm-opt -O1 --strip-dwarf 2936 4375k 1109k 1s false false
polkadot wasm-opt -O2 --strip-dwarf 2597 4364k 1120k 6s false false
polkadot wasm-opt -O3 --strip-dwarf 2592 4335k 1125k 9s false false
polkadot wasm-opt -O4 --strip-dwarf 2593 4359k 1133k 17s false false
polkadot wasm-opt -Os --strip-dwarf 2594 4331k 1124k 7s false false
polkadot wasm-opt -Oz --strip-dwarf 2594 4331k 1124k 9s false false
polkadot wasm-opt -O0 -g 3005 5110k 1258k 2s true true
polkadot wasm-opt -O1 -g 3005 4939k 1237k 3s true true
polkadot wasm-opt -O2 -g 3005 4931k 1234k 3s true true
polkadot wasm-opt -O3 -g 3005 4919k 1234k 4s true true
polkadot wasm-opt -O4 -g 3005 4919k 1234k 4s true true
polkadot wasm-opt -Os -g 3005 4914k 1233k 3s true true
polkadot wasm-opt -Oz -g 3005 4915k 1233k 4s true true
polkadot wasm-opt -O0 -g --strip-dwarf 3005 4941k 1206k 0s true false
polkadot wasm-opt -O1 -g --strip-dwarf 3005 4771k 1185k 1s true false
polkadot wasm-opt -O2 -g --strip-dwarf 3005 4763k 1182k 1s true false
polkadot wasm-opt -O3 -g --strip-dwarf 3005 4751k 1183k 2s true false
polkadot wasm-opt -O4 -g --strip-dwarf 3005 4751k 1183k 2s true false
polkadot wasm-opt -Os -g --strip-dwarf 3005 4746k 1182k 2s true false
polkadot wasm-opt -Oz -g --strip-dwarf 3005 4747k 1182k 2s true false
production 3907 6843k 1694k 0s true true
production wasm-gc 3907 6617k 1636k 0s true false
production wasm-gc --no-demangle 3907 6696k 1645k 0s true false
production wasm-opt -O0 3907 6406k 1594k 0s false true
production wasm-opt -O1 3832 6140k 1566k 1s false true
production wasm-opt -O2 3463 6122k 1585k 7s false true
production wasm-opt -O3 3458 6070k 1591k 14s false true
production wasm-opt -O4 3458 6108k 1602k 23s false true
production wasm-opt -Os 3459 6063k 1591k 10s false true
production wasm-opt -Oz 3459 6063k 1590k 14s false true
production wasm-opt -O0 --strip-dwarf 3907 6228k 1547k 0s false false
production wasm-opt -O1 --strip-dwarf 3832 5962k 1517k 1s false false
production wasm-opt -O2 --strip-dwarf 3463 5949k 1538k 6s false false
production wasm-opt -O3 --strip-dwarf 3458 5896k 1548k 13s false false
production wasm-opt -O4 --strip-dwarf 3458 5933k 1558k 22s false false
production wasm-opt -Os --strip-dwarf 3459 5889k 1547k 10s false false
production wasm-opt -Oz --strip-dwarf 3459 5890k 1546k 13s false false
production wasm-opt -O0 -g 3907 6882k 1695k 3s true true
production wasm-opt -O1 -g 3907 6642k 1670k 4s true true
production wasm-opt -O2 -g 3907 6632k 1665k 4s true true
production wasm-opt -O3 -g 3907 6601k 1663k 5s true true
production wasm-opt -O4 -g 3907 6601k 1663k 5s true true
production wasm-opt -Os -g 3907 6594k 1662k 5s true true
production wasm-opt -Oz -g 3907 6595k 1661k 5s true true
production wasm-opt -O0 -g --strip-dwarf 3907 6691k 1640k 0s true false
production wasm-opt -O1 -g --strip-dwarf 3907 6452k 1614k 1s true false
production wasm-opt -O2 -g --strip-dwarf 3907 6442k 1610k 1s true false
production wasm-opt -O3 -g --strip-dwarf 3907 6412k 1608k 2s true false
production wasm-opt -O4 -g --strip-dwarf 3907 6412k 1608k 2s true false
production wasm-opt -Os -g --strip-dwarf 3907 6404k 1608k 2s true false
production wasm-opt -Oz -g --strip-dwarf 3907 6406k 1608k 2s true false
release 8752 9339k 1914k 0s true true
release wasm-gc 8752 8989k 1841k 0s true false
release wasm-gc --no-demangle 8752 9166k 1859k 0s true false
release wasm-opt -O0 8752 8347k 1738k 0s false true
release wasm-opt -O1 8044 7807k 1706k 1s false true
release wasm-opt -O2 3993 7658k 1862k 7s false true
release wasm-opt -O3 3895 7561k 1851k 45s false true
release wasm-opt -O4 3889 7601k 1873k 55s false true
release wasm-opt -Os 3951 7505k 1847k 35s false true
release wasm-opt -Oz 3951 7505k 1855k 37s false true
release wasm-opt -O0 --strip-dwarf 8752 8099k 1684k 0s false false
release wasm-opt -O1 --strip-dwarf 8044 7564k 1648k 1s false false
release wasm-opt -O2 --strip-dwarf 3993 7453k 1809k 7s false false
release wasm-opt -O3 --strip-dwarf 3895 7356k 1797k 44s false false
release wasm-opt -O4 --strip-dwarf 3889 7395k 1821k 55s false false
release wasm-opt -Os --strip-dwarf 3951 7300k 1794k 35s false false
release wasm-opt -Oz --strip-dwarf 3951 7300k 1804k 37s false false
release wasm-opt -O0 -g 8752 9421k 1918k 4s true true
release wasm-opt -O1 -g 8752 9174k 1901k 5s true true
release wasm-opt -O2 -g 8752 9162k 1896k 5s true true
release wasm-opt -O3 -g 8752 9075k 1895k 6s true true
release wasm-opt -O4 -g 8752 9075k 1895k 6s true true
release wasm-opt -Os -g 8752 9066k 1893k 6s true true
release wasm-opt -Oz -g 8752 9066k 1893k 6s true true
release wasm-opt -O0 -g --strip-dwarf 8752 9159k 1855k 0s true false
release wasm-opt -O1 -g --strip-dwarf 8752 8913k 1831k 1s true false
release wasm-opt -O2 -g --strip-dwarf 8752 8902k 1829k 1s true false
release wasm-opt -O3 -g --strip-dwarf 8752 8814k 1828k 2s true false
release wasm-opt -O4 -g --strip-dwarf 8752 8814k 1828k 2s true false
release wasm-opt -Os -g --strip-dwarf 8752 8805k 1826k 2s true false
release wasm-opt -Oz -g --strip-dwarf 8752 8805k 1826k 2s true false

Some observations:

  • When ran without -g and --strip-dwarf the name section is stripped, but the DWARF sections are kept.
  • When ran with -g the name section is kept.
  • The name section is used when decoding symbols so that e.g. instead of _wasm_function_1234 we get wasmi_validation::func::FunctionValidationContext::validate_unop::ha8ecb960f7924be1.
  • wasm-gc leaves the name section, but strips the DWARF sections.
  • Specifying -g disables most of wasm-opt's transformations.
  • I have no idea whether the DWARF sections are correct after wasm-opt's transformations; it's possible they are garbage. It seems strange that having wasm-opt emit the name section (which should be very easy to fixup after transformations) disables most of its transformations, but having it emit the DWARF sections (which should be harder to fixup) do not.
  • The smallest compressed size we get with wasm-opt -O1 --strip-dwarf
  • A direct replacement of our current use of wasm-gc would be wasm-opt -O0 -g --strip-dwarf, which is only 3k bigger compressed for polkadot.

Performance-wise it's a little tricky to compare. I tried to generate a matrix of number for one of the contracts' benchmarks, but that's not really possible due to wasmtime's loop misalignment bug randomly tanking the performance. One interesting thing I've found for those combinations which are not affected by the bug is this (lower is better):

  • pallet_contracts/seal_return_per_kb in release configuration: ~225
  • pallet_contracts/seal_return_per_kb in release + O2 and release + O3 configuration: ~215
  • pallet_contracts/seal_return_per_kb in production configuration: ~191
  • pallet_contracts/seal_return_per_kb in production + O3 configuration: ~211

So when built under the release profile we get a slightly a slight performance bump when using wasm-opt -O2, but when we build the runtime under the production profile we get a performance regression when using wasm-opt -O3. Of course, this is just one data point with one benchmark on one machine, but this does suggest that performance-wise it's not necessarily a slam dunk.

Also I'm a little nervous about is the correctness of the transformations that the wasm-opt is doing here, especially at more niche levels like -Oz. I'm not necessarily saying it's going to be buggy, but it does vastly more than wasm-gc so I'd like some stronger guarantee that it's not going to cause problems.

So if we decide to introduce this I think I'd like to run a full sync burn-in with wasm-opt hooked into the node. (Basically just hack the executor so that it preoptimizes any .wasm blob with wasm-opt that it loads before compiling it, and then sync a node from scratch and see if it's successful.) I can set such a burn-in up on one of my GCP machines provided we can come to consensus as to which optimization level we want to pick here.

Open questions

  • Do we want to keep the name section? Having this should make it possible to decode the symbols when a panic happens inside of WASM, but can cost up to ~80k compressed, and prohibits most of wasm-opt's transformations.
  • Do we want to keep the DWARF sections? This should mostly only be useful when hooking into the process with a debugger, I think?
  • Do we want to actually use wasm-opt with a non-zero optimization level? The benefits are kind-of dubious; it only nets us something like ~30k compressed while being potentially risky.

cc @pepyakin

koute avatar Sep 22 '22 15:09 koute

Do we want to actually use wasm-opt with a non-zero optimization level? The benefits are kind-of dubious; it only nets us something like ~30k compressed while being potentially risky.

According to @Robbepop newer wasmi versions depend on these optimizations (the wasmi code itself). Apart from that running with O0 as a first step sounds sensible. We can change the level in a follow up?

athei avatar Sep 23 '22 06:09 athei

According to @Robbepop newer wasmi versions depend on these optimizations (the wasmi code itself). Apart from that running with O0 as a first step sounds sensible. We can change the level in a follow up?

Ah, yes, right, I remember that. In which case can we just switch to a wasm-gc-equivalent configuration for now (which should be safe enough to do as-is), and potentially bump the optimization level after we upgrade wasmi?

koute avatar Sep 23 '22 07:09 koute

A direct replacement of our current use of wasm-gc would be wasm-opt -O0 -g --strip-dwarf, which is only 3k bigger compressed for polkadot

But do we really want to keep the name section for a production build?

athei avatar Sep 23 '22 07:09 athei

A direct replacement of our current use of wasm-gc would be wasm-opt -O0 -g --strip-dwarf, which is only 3k bigger compressed for polkadot

But do we really want to keep the name section for a production build?

This is a good question. Being able to decode the symbols in case of an e.g. panic is something that is nice to have, and the size difference compressed is not that big. Personally I'd probably keep it, but I'm willing to be convinced otherwise, however in that case I'd insist on updating our release process so that fully unstripped runtime blobs also be posted in our releases alongside the .compact.compressed.wasm. I don't want to end up in a situation where I have to debug a runtime issue and will have no way of knowing what is what without reverse engineering the bytecode to figure out which function is which.

koute avatar Sep 23 '22 07:09 koute

Sounds good. Let's just go with a wasm-gc equivalent config and then investigate more later.

athei avatar Sep 23 '22 07:09 athei

Being able to decode the symbols in case of an e.g. panic is something that is nice to have, and the size difference compressed is not that big

100%. Please keep the name sections. Stripping them will just make our live harder. All the work it would require on any failure to get it running again with the correct name section found in some release, so that we get some usable stack trace. IMO all this hassle for some saved space isn't worth it.

bkchr avatar Sep 23 '22 12:09 bkchr

Being able to decode the symbols in case of an e.g. panic is something that is nice to have, and the size difference compressed is not that big

100%. Please keep the name sections. Stripping them will just make our live harder. All the work it would require on any failure to get it running again with the correct name section found in some release, so that we get some usable stack trace. IMO all this hassle for some saved space isn't worth it.

Technically you wouldn't have to reproduce the problem again, I think? You could extract the 'number -> name' mappings from the unstripped blob and then just run a search & replace on the logs/panic backtrace. It'd be a pain, but it should be doable.

That said, I definitely agree that I also prefer to have the symbols just be decoded by default. We're not the only people building on substrate, so we should make the out-of-box development experience as good as we can.

koute avatar Sep 23 '22 12:09 koute

Technically you wouldn't have to reproduce the problem again, I think? You could extract the 'number -> name' mappings from the unstripped blob and then just run a search & replace on the logs/panic backtrace. It'd be a pain, but it should be doable.

Yeah got point, still sounds like something I don't want to do :P

That said, I definitely agree that I also prefer to have the symbols just be decoded by default. We're not the only people building on substrate, so we should make the out-of-box development experience as good as we can.

This is always good to repeat! :D

bkchr avatar Sep 23 '22 12:09 bkchr

Yes we should keep the name section. However, wasm-opt seems to always remove it no matter what I set debug_info to. I guess we need to wait until this binding matures a bit.

athei avatar Sep 25 '22 09:09 athei

Author of the wasm-opt bindings here. I'm glad to see another project interested in using wasm-opt-rs. There are a lot of bugs in the -preview release of the crate, and I expect to make a real release in a matter of days that is much better tested.

brson avatar Sep 30 '22 13:09 brson

@athei I have published wasm-opt crate 0.110.0: https://docs.rs/wasm-opt/latest/wasm_opt/

This version is complete and reasonably well tested for identical behavior to the wasm-opt CLI. It would be reasonable to try to integrate it into wasm-builder. I am going to do the same for cargo-contract right now.

There are a few things to be aware of:

  • It depends on Rust 1.48+
  • It depends on a c++17 compiler, and should emit a readable error if one is not available.
  • The wasm-opt-sys crate takes a non-negligible amount of time to build (minutes on my underpowered laptop). It also does not do any incremental recompilation, so if the build is invalidated it will rebuild the C++ code from scratch. The lack of incremental recompilation is a limitation self-imposed by not using cmake or other external build system.
  • wasm-opt on windows does not support extended unicode paths (probably anything non-ascii, maybe latin-1, it's unclear). This is a limitation of binaryen and not a regression of the bindings. It may or may not be fixed in the future.
  • cargo tarpaulin (code coverage) segfaults running any wasm-opt crates (https://github.com/brson/wasm-opt-rs/issues/59), reason unknown. This behavior could infect other crates that link to wasm-opt. If you use tarpaulin, you might check it.

brson avatar Oct 05 '22 00:10 brson

Here is the cargo-contract PR: https://github.com/paritytech/cargo-contract/pull/766

brson avatar Oct 05 '22 00:10 brson

Updated to the newest version. The name section is kept now.

athei avatar Oct 05 '22 06:10 athei

Comparing a production build of the kitchensink runtime.

(The underlying .wasm blob is not completely the same so some of the size differences before-after are because of that.)

With wasm-gc (4896564 bytes)

  types                                  |        0xb -      0x355 |       842 bytes | 100 count
  imports                                |      0x358 -      0xc51 |      2297 bytes | 55 count
  functions                              |      0xc54 -     0x1800 |      2988 bytes | 2986 count
  tables                                 |     0x1802 -     0x1809 |         7 bytes | 1 count
  globals                                |     0x180b -     0x1824 |        25 bytes | 3 count
  exports                                |     0x1827 -     0x1db6 |      1423 bytes | 44 count
  elements                               |     0x1db9 -     0x2564 |      1963 bytes | 1 count
  code                                   |     0x2569 -   0x40775f |   4215286 bytes | 2986 count
  data                                   |   0x407763 -   0x46036e |    363531 bytes | 3 count
  custom "runtime_apis"                  |   0x46037e -   0x460432 |       180 bytes | 1 count
  custom "runtime_version"               |   0x460444 -   0x46046a |        38 bytes | 1 count
  custom "name"                          |   0x460473 -   0x4ab6dd |    307818 bytes | 1 count
  custom "producers"                     |   0x4ab6e9 -   0x4ab734 |        75 bytes | 1 count

With wasm-opt (5142894 bytes)

  types                                  |        0xb -      0x355 |       842 bytes | 100 count
  imports                                |      0x358 -      0xc51 |      2297 bytes | 55 count
  functions                              |      0xc54 -     0x1800 |      2988 bytes | 2986 count
  tables                                 |     0x1802 -     0x1809 |         7 bytes | 1 count
  memories                               |     0x180b -     0x180c |         1 bytes | 0 count
  globals                                |     0x180e -     0x1827 |        25 bytes | 3 count
  exports                                |     0x182a -     0x1db9 |      1423 bytes | 44 count
  elements                               |     0x1dbc -     0x2567 |      1963 bytes | 1 count
  code                                   |     0x256c -   0x40fb54 |   4249064 bytes | 2986 count
  data                                   |   0x40fb58 -   0x468743 |    363499 bytes | 3 count
  custom "name"                          |   0x46874c -   0x4c2b2e |    369634 bytes | 1 count
  custom ".debug_info"                   |   0x4c2b3e -   0x4caa76 |     32568 bytes | 1 count
  custom ".debug_pubtypes"               |   0x4caa89 -   0x4cac03 |       378 bytes | 1 count
  custom "runtime_apis"                  |   0x4cac13 -   0x4cacc7 |       180 bytes | 1 count
  custom ".debug_ranges"                 |   0x4cacd8 -   0x4cd140 |      9320 bytes | 1 count
  custom ".debug_abbrev"                 |   0x4cd151 -   0x4cdc86 |      2869 bytes | 1 count
  custom "runtime_version"               |   0x4cdc98 -   0x4cdcbe |        38 bytes | 1 count
  custom ".debug_line"                   |   0x4cdcce -   0x4d3bda |     24332 bytes | 1 count
  custom ".debug_str"                    |   0x4d3be9 -   0x4e2961 |     60792 bytes | 1 count
  custom ".debug_pubnames"               |   0x4e2975 -   0x4e7917 |     20386 bytes | 1 count
  custom "producers"                     |   0x4e7923 -   0x4e796e |        75 bytes | 1 count

Original .wasm blob before wasm-opt (5374662)

  types                                  |        0xb -      0x355 |       842 bytes | 100 count
  imports                                |      0x358 -      0xc51 |      2297 bytes | 55 count
  functions                              |      0xc54 -     0x1800 |      2988 bytes | 2986 count
  tables                                 |     0x1802 -     0x1809 |         7 bytes | 1 count
  globals                                |     0x180b -     0x1824 |        25 bytes | 3 count
  exports                                |     0x1827 -     0x1db6 |      1423 bytes | 44 count
  elements                               |     0x1db9 -     0x2564 |      1963 bytes | 1 count
  code                                   |     0x2569 -   0x4484ac |   4480835 bytes | 2986 count
  data                                   |   0x4484b0 -   0x4a109b |    363499 bytes | 3 count
  custom ".debug_info"                   |   0x4a10ab -   0x4a8fe3 |     32568 bytes | 1 count
  custom ".debug_pubtypes"               |   0x4a8ff6 -   0x4a9170 |       378 bytes | 1 count
  custom "runtime_apis"                  |   0x4a9180 -   0x4a9234 |       180 bytes | 1 count
  custom ".debug_ranges"                 |   0x4a9245 -   0x4ab6ad |      9320 bytes | 1 count
  custom ".debug_abbrev"                 |   0x4ab6be -   0x4ac1f3 |      2869 bytes | 1 count
  custom "runtime_version"               |   0x4ac205 -   0x4ac22b |        38 bytes | 1 count
  custom ".debug_line"                   |   0x4ac23b -   0x4b2147 |     24332 bytes | 1 count
  custom ".debug_str"                    |   0x4b2156 -   0x4c0ece |     60792 bytes | 1 count
  custom ".debug_pubnames"               |   0x4c0ee2 -   0x4c5e84 |     20386 bytes | 1 count
  custom "name"                          |   0x4c5e8d -   0x52026f |    369634 bytes | 1 count
  custom "producers"                     |   0x52027b -   0x5202c6 |        75 bytes | 1 count
  • Looks like wasm-opt is adding an empty memories section for some reason? I'm not entirely sure off the top of my head whether our code actually handles this case. I see the tests on the CI were randomly killed; we need to restart those jobs and see whether anything fails. If we're going to support such a configuration I'll have to add an explicit test to our executor tests to make sure this never breaks.
  • The name section is indeed still here, so that's good!
  • The debug sections are not stripped unlike wasm-gc.

koute avatar Oct 12 '22 15:10 koute

@kripken I wonder if you could look at the notes at the bottom of the last comment and comment on the memories section and debug sections.

brson avatar Oct 13 '22 17:10 brson

@koute is the goal to keep the name section but strip all the debug sections?

brson avatar Oct 13 '22 17:10 brson

(hi, wasm-opt dev here!)

Use the --strip-dwarf pass to remove the DWARF sections but not the names section. (--strip-debug would also strip the name section.)

I'm not sure what "memories section" refers to... which tool is reporting that output? If it's the memory section, then surely it must exist before wasm-opt runs, as this code uses a memory..? (all Rust code does) wasm-opt would never add a memory, it would only remove one if it is not used.

kripken avatar Oct 13 '22 18:10 kripken

With the builder, --strip-dwarf looks like it would be written .add_pass(Pass::StripDwarf).

brson avatar Oct 13 '22 18:10 brson

@koute is the goal to keep the name section but strip all the debug sections?

Yes. Currently we want to just get an wasm-gc-equivalent output.

I'm not sure what "memories section" refers to... which tool is reporting that output?

It's wasm-tools's objdump subcommand.

If it's the memory section, then surely it must exist before wasm-opt runs, as this code uses a memory..? (all Rust code does) wasm-opt would never add a memory, it would only remove one if it is not used.

@kripken Indeed. But it doesn't actually have a memory because it's importing the memory. But wasm-opt is adding an unnecessary empty section where memories would be if there were actually any.

$ echo '(module (import "env" "memory" (memory (;0;) 25)))' > test.wat
$ wat2wasm test.wat
$ wasm-opt test.wasm -o test-opt.wasm
warning: no passes specified, not doing any work
$ wasm-tools objdump test.wasm
  imports                                |        0xa -       0x19 |        15 bytes | 1 count
$ wasm-tools objdump test-opt.wasm
  imports                                |        0xa -       0x19 |        15 bytes | 1 count
  memories                               |       0x1b -       0x1c |         1 bytes | 0 count
$ md5sum test.wasm test-opt.wasm
68d233fe559fcedc137d4f015cfdc0af  test.wasm
9f19fffbb65493887087c341987342b0  test-opt.wasm
$ stat -c %s test.wasm
25
$ stat -c %s test-opt.wasm
28
$ wasm-tools dump test.wasm
  0x0 | 00 61 73 6d | version 1 (Module)
      | 01 00 00 00
  0x8 | 02 0f       | import section
  0xa | 01          | 1 count
  0xb | 03 65 6e 76 | import [memory 0] Import { module: "env", name: "memory", ty: Memory(MemoryType { memory64: false, shared: false, initial: 25, maximum: None }) }
      | 06 6d 65 6d
      | 6f 72 79 02
      | 00 19      
$ wasm-tools dump test-opt.wasm
  0x0 | 00 61 73 6d | version 1 (Module)
      | 01 00 00 00
  0x8 | 02 0f       | import section
  0xa | 01          | 1 count
  0xb | 03 65 6e 76 | import [memory 0] Import { module: "env", name: "memory", ty: Memory(MemoryType { memory64: false, shared: false, initial: 25, maximum: None }) }
      | 06 6d 65 6d
      | 6f 72 79 02
      | 00 19      
 0x19 | 05 01       | memory section
 0x1b | 00          | 0 count

So as you can see it adds an empty section for memories, even if there are no memories. This is observable, e.g.:

fn check(path: &str) {
    let module = parity_wasm::deserialize_file(path).unwrap();
    println!("Memories for file {:?}: {:?}", path, module.memory_section().map(|section| section.entries()));
}

fn main() {
    check("test.wasm");
    check("test-opt.wasm");
}

outputs the following:

Memories for file "test.wasm": None
Memories for file "test-opt.wasm": Some([])

which is why I said that I don't know if we properly handle this.

koute avatar Oct 14 '22 06:10 koute

Use the --strip-dwarf pass to remove the DWARF sections but not the names section. (--strip-debug would also strip the name section.)

With the builder, --strip-dwarf looks like it would be written .add_pass(Pass::StripDwarf).

Thanks. This is valuable information. I added the pass.

athei avatar Oct 14 '22 10:10 athei

@koute

wasm-opt is adding an unnecessary empty section where memories would be if there were actually any.

I see, thanks! That does look like a bug in wasm-opt that causes some wasted bytes. I'll fix it.

kripken avatar Oct 14 '22 15:10 kripken

I have published version 0.110.1 of the wasm-opt crate. It reduces the amount of Binaryen source code packaged with the crates from 72 MB to 10 MB.

brson avatar Oct 16 '22 00:10 brson

The unneeded memory section issue has been fixed in binaryen: https://github.com/WebAssembly/binaryen/pull/5145

kripken avatar Oct 17 '22 22:10 kripken

The unneeded memory section issue has been fixed in binaryen: WebAssembly/binaryen#5145

I can probably backport this to Binaryen 110, which wasm-opt-rs is already forking for another backport, and make a point release of the wasm-opt crate. It might take me some time though as I have a busy week.

https://github.com/brson/wasm-opt-rs/issues/105

brson avatar Oct 17 '22 23:10 brson

The just-published 0.110.2 contains the backport to remove the memories section.

brson avatar Oct 27 '22 20:10 brson

Thanks @brson!

I'll do one last check, and if everything's good this should be good to go.

koute avatar Oct 28 '22 13:10 koute

I updated the PR to the new version. Hopefully the CI will be green now.

athei avatar Oct 28 '22 13:10 athei