
Compiled WebAssembly artifacts are larger than those compiled using Emscripten

Open valadaptive opened this issue 5 months ago • 5 comments

I'm working on a JS library that relies heavily on C/C++ libraries compiled to WebAssembly. Currently I'm using Emscripten to do this, but I think wasi-sdk is probably the way forward (Emscripten does some pretty janky stuff).

Right now, the biggest blocker is that the WebAssembly libraries compiled by wasi-sdk are larger than their Emscripten-compiled counterparts. Here are the current WebAssembly artifacts for both; the wasi-sdk ones have a 2 appended to their names. The build.sh script builds the Emscripten artifacts whereas the makefile builds the wasi-sdk ones. Currently:

  • hb2.wasm is 698KiB vs hb.wasm's 661KiB.
  • woff1_2.wasm is 93KiB vs woff1.wasm's 78KiB.
  • woff2_2.wasm is 728KiB vs woff2.wasm's 712KiB.
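To put the three gaps on a common scale, the sizes listed above can be converted to relative overheads. This is only a small arithmetic sketch; the dictionary keys and the helper name `overhead_percent` are made up for illustration, and the numbers are the KiB figures quoted in the list.

```python
# Sizes in KiB as quoted in the list above: (wasi-sdk build, Emscripten build).
sizes_kib = {
    "hb": (698, 661),
    "woff1": (93, 78),
    "woff2": (728, 712),
}

def overhead_percent(wasi_kib, emscripten_kib):
    """Relative size increase of the wasi-sdk build over the Emscripten build."""
    return (wasi_kib - emscripten_kib) / emscripten_kib * 100.0

for name, (wasi, emcc) in sizes_kib.items():
    print(f"{name}: +{wasi - emcc} KiB ({overhead_percent(wasi, emcc):.1f}%)")
```

The absolute gap is between 15 and 37 KiB in all three cases, while the relative gap ranges from about 2% (woff2) to about 19% (woff1).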

I'm having a hard time figuring out where the extra file size is coming from. The only code size profiler I know of is twiggy, which is going to be archived soon and requires debug symbols to be included anyway, which defeats the entire point.

In both cases, I'm compiling using -Oz, enabling LTO, and stripping all symbols.

My guess is that it's some combination of libc containing different code, libc being built with different compiler flags, and different WebAssembly target features being enabled. I have no idea how to pick those apart though.
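One coarse way to start picking the size apart without debug symbols is a per-section byte count, since the WebAssembly binary format is a header followed by sections that each carry an ID and a LEB128-encoded length. The following is a minimal sketch, not a replacement for a real profiler; the function names `read_uleb128` and `section_sizes` are hypothetical, and it only reports section payload sizes, not per-function sizes.

```python
from collections import defaultdict

# Section IDs defined by the WebAssembly binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "datacount", 13: "tag",
}

def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def section_sizes(wasm_bytes):
    """Return {section_name: total_payload_bytes} for a WebAssembly module."""
    if wasm_bytes[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    sizes = defaultdict(int)
    pos = 8  # skip the 4-byte magic number and the 4-byte version
    while pos < len(wasm_bytes):
        section_id = wasm_bytes[pos]
        size, pos = read_uleb128(wasm_bytes, pos + 1)
        name = SECTION_NAMES.get(section_id, f"unknown({section_id})")
        sizes[name] += size  # custom sections may appear more than once
        pos += size
    return dict(sizes)
```

Calling `section_sizes(open("hb.wasm", "rb").read())` and the same for `hb2.wasm` would show whether the difference sits mostly in the code section or in the data section, which narrows down whether the compiled code or the static data is responsible.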

valadaptive avatar Jul 24 '25 09:07 valadaptive

Are you running wasm-opt on the resulting binary?

I believe you can use bloaty to analyze the size of wasm files. See https://github.com/google/bloaty.

sbc100 avatar Jul 24 '25 17:07 sbc100

wasm-opt reduces the size by ~1KiB only. I'll look at bloaty.

valadaptive avatar Jul 24 '25 19:07 valadaptive

Another way to profile binaries is wasm-opt --metrics and wasm-opt --func-metrics.

You may also want to compare wasm-opt flags between what emcc sends and what you are doing manually. See e.g.

https://github.com/WebAssembly/binaryen/wiki/Optimizer-Cookbook#low-memory-unused

kripken avatar Jul 24 '25 19:07 kripken

Seems like the emscripten-built version contains bulk memory operations (and other stuff that makes bloaty spit out error messages, although it won't tell me exactly why).

Compiling with -mbulk-memory doesn't seem to help (is that even the right compiler flag? Google turns up nothing.) I might need to build wasi-libc with different compiler flags, but I'm not sure how to do that either; when I tried building it with the MinSizeRel profile, it failed on an assertion step which specifically checks that you haven't changed the compiler flags at all.

valadaptive avatar Jul 24 '25 21:07 valadaptive

Comparing the wasm-opt --metrics output for the Emscripten version:

Metrics
total
 [exports]      : 73      
 [funcs]        : 1710    
 [globals]      : 1       
 [imports]      : 4       
 [memories]     : 1       
 [memory-data]  : 49066   
 [table-data]   : 169     
 [tables]       : 1       
 [tags]         : 0       
 [total]        : 319965  
 [vars]         : 5860    
 Binary         : 52389   
 Block          : 12803   
 Break          : 13662   
 Call           : 10366   
 CallIndirect   : 147     
 Const          : 50871   
 Drop           : 795     
 GlobalGet      : 429     
 GlobalSet      : 853     
 If             : 5821    
 Load           : 24309   
 LocalGet       : 89761   
 LocalSet       : 28817   
 Loop           : 2093    
 MemoryCopy     : 299     
 MemoryFill     : 447     
 MemoryGrow     : 1       
 MemorySize     : 2       
 RefFunc        : 169     
 Return         : 591     
 Select         : 1733    
 Store          : 16096   
 Switch         : 234     
 Unary          : 6356    
 Unreachable    : 921

and the wasi-sdk version:

Metrics
total
 [exports]      : 69      
 [funcs]        : 1702    
 [globals]      : 1       
 [imports]      : 7       
 [memories]     : 1       
 [memory-data]  : 50902   
 [table-data]   : 167     
 [tables]       : 1       
 [tags]         : 0       
 [total]        : 331778  
 [vars]         : 6004    
 Binary         : 55782   
 Block          : 12911   
 Break          : 13872   
 Call           : 10402   
 CallIndirect   : 149     
 Const          : 55463   
 Drop           : 777     
 GlobalGet      : 423     
 GlobalSet      : 844     
 If             : 5945    
 Load           : 24616   
 LocalGet       : 90763   
 LocalSet       : 30238   
 Loop           : 2106    
 MemoryCopy     : 327     
 MemoryFill     : 472     
 MemoryGrow     : 2       
 MemorySize     : 3       
 RefFunc        : 167     
 Return         : 593     
 Select         : 1713    
 Store          : 16423   
 Switch         : 233     
 Unary          : 6616    
 Unreachable    : 938

This is with the latest version of wasi-sdk, with the sysroot built using -DWASI_SDK_CPU_CFLAGS="-mcpu=lime1 -mbulk-memory". There are bulk memory operations in both; it looks like I just need to manually specify --enable-bulk-memory-opt for the former (the Emscripten build) to avoid validation errors for some reason.

Nothing is really jumping out at me; it looks like the wasi-sdk version just...contains more code spread around everywhere. I also think I need to compile with debug info to get any useful stats out of bloaty, which again somewhat defeats the purpose since the debug info will be counted in the code size.
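The two metrics dumps above are easier to read as a sorted diff than side by side. Below is a small sketch that parses `name : count` lines and ranks the entries by absolute difference; `parse_metrics` and `diff_metrics` are hypothetical helper names, and the parser assumes the plain text layout shown in the dumps above rather than any documented output format.

```python
def parse_metrics(text):
    """Parse `name : count` lines (as in the dumps above) into {name: count}."""
    metrics = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            metrics[name.strip()] = int(value.strip())
    return metrics

def diff_metrics(base, other):
    """Return [(name, base, other, delta)], largest absolute delta first."""
    rows = []
    for name in sorted(set(base) | set(other)):
        a, b = base.get(name, 0), other.get(name, 0)
        if a != b:
            rows.append((name, a, b, b - a))
    rows.sort(key=lambda row: abs(row[3]), reverse=True)
    return rows

# A few lines copied from the two dumps above (Emscripten first, then wasi-sdk).
emscripten = parse_metrics(" [total] : 319965\n Binary : 52389\n Const : 50871\n LocalSet : 28817")
wasi_sdk = parse_metrics(" [total] : 331778\n Binary : 55782\n Const : 55463\n LocalSet : 30238")

for name, a, b, delta in diff_metrics(emscripten, wasi_sdk):
    print(f"{name:10} {a:>8} {b:>8} {delta:+}")
```

On the full dumps, the largest per-instruction differences are Const (+4592), Binary (+3393), LocalSet (+1421), and LocalGet (+1002), out of a [total] difference of +11813. Const and Binary together make up roughly two thirds of the extra instructions, so the growth is concentrated in constants and arithmetic rather than spread evenly.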

valadaptive avatar Aug 06 '25 03:08 valadaptive