Size optimisation

Open ColinEberhardt opened this issue 5 years ago • 10 comments
I'm using this toolchain in order to create a simple wasm module for use with wagi. The wasm module size has a significant impact on the time it takes to respond to a request - as per https://github.com/deislabs/wagi/issues/3

With this simple example:

#include <iostream>

using namespace std;

int main() {
  cout << "Content-Type: text/plain\n";
  cout << "\n";
  cout << "hello world C++";
}

Compiled as follows:

/wasi-sdk-11.0/bin/clang++ --sysroot=/wasi-sdk-11.0/share/wasi-sysroot  test.cpp -o test.wasm

The test.wasm output is 606 KBytes.

If I optimise with wasm-opt:

wasm-opt -Oz test.wasm -o test.wasm

It drops to 209 KBytes, which is still a lot larger than I'd expect.

ColinEberhardt avatar Oct 27 '20 17:10 ColinEberhardt

We are always open to ideas on how to shrink the output binaries. If you want to help with that effort, there are tools out there such as https://github.com/rustwasm/twiggy and https://github.com/google/bloaty which can help you figure out where the space is going.

One thing you can try is building with -flto and (I think you can do this now) linking with the LTO versions of the standard libraries.

sbc100 avatar Oct 27 '20 17:10 sbc100

Most likely the majority of the space is the overhead of using C++ streams, which I believe transitively bring in a lot of stuff.

sbc100 avatar Oct 27 '20 17:10 sbc100

How does the code size change when adding -Oz to clang?

kripken avatar Oct 27 '20 17:10 kripken

Adding that drops the size from 606 KB to 604 KB.

ColinEberhardt avatar Oct 27 '20 18:10 ColinEberhardt

The other major factor here is static linking, since WASI doesn't yet support dynamic linking -- though it is planned. For comparison, on a native toolchain, clang++ -Oz -static with the code above produces a 2.3M executable. So a combination of C++ <iostream> being a very code-size-unfriendly API and static linking are likely the main factors for this testcase.

sunfishcode avatar Oct 27 '20 18:10 sunfishcode

Those two are definitely valid issues, but I suspect there is some big low-hanging fruit here. 209 KB for an optimized build is still pretty high.

For comparison, unoptimized emcc emits 157 KB on this (so there is more than optimization going on here), and emcc -Oz emits 129 KB (so optimizations can help a lot).

It's possible this could be improved either in the compiled libraries here, or perhaps we can find interesting new optimizations for wasm-opt to do on wasi-sdk output specifically.

kripken avatar Oct 27 '20 18:10 kripken

Another thing that might help is LTO; https://github.com/WebAssembly/wasi-sdk/pull/89 implements LTO in wasi-sdk in a way that includes libcxx and libc, which should enable more aggressive dead-code elimination. That PR is a little out of date at the moment, but hopefully I'll have time to get back to it soon.

sunfishcode avatar Oct 27 '20 19:10 sunfishcode

Hello, I am facing a similar size issue. LTO could be interesting for size reduction; did you observe any gain? Is LTO planned to be delivered? :)

ghost avatar May 07 '21 18:05 ghost

I do hope to finish #89 at some point, but don't have a specific timeline.

sunfishcode avatar May 12 '21 17:05 sunfishcode

Something literally just broke for me this week.

For some reason brew install binaryen on macOS is dropping in Binaryen https://github.com/WebAssembly/binaryen/releases/tag/version_108

But GitHub Actions is still on version_91 for Ubuntu 20.04 LTS. My build works on the older version, with these messages:

unknown name subsection at 22675776
unknown name subsection at 22675796

I was ignoring those and enjoying a <1MB file size for Quake3e-WASM. Awesome!

Last week, however, Binaryen on macOS stopped shrinking my files. Now it doesn't do anything. Let me guess! Someone added some obscure, nuanced command-line option that I have to dig through 6 months of changelogs to figure out!

This file size problem:

(screenshot: Screen Shot 2022-06-14 at 7 35 22 PM)

This problem is the most ridiculous nuisance I have dealt with the entire time using WASI. This entire platform is held back from becoming mainstream because of this horrible problem. I mean, no developer in their right mind is going to use WebAssembly in production when the default compile settings output files with huge amounts of zeros.

Meanwhile, with an older version of wasm-opt:

quake3e_opengl2_js.wasm - 416KB

^ Still seems a little big, but I can live with that.

YES!

--zero-filled-memory

Incredible. It appears the assumption is that hardly anyone is compiling directly to WASM without Emscripten. I hope this becomes more mainstream.

ghost avatar Jun 15 '22 02:06 ghost

One issue I've noticed recently is that the debug and DWARF custom sections can be quite large. I'm not sure if the posters above found this to be an issue, but I typically resolve it with wasm-opt --strip-debug --strip-dwarf. I'm going to close this issue since the LTO work is tracked elsewhere; we can re-open it if there's more to be done here.

abrown avatar Mar 17 '23 17:03 abrown