pyodide-pack

Explore the size reduction of Emscripten shared objects

Open agriyakhetarpal opened this issue 10 months ago • 2 comments

There are a bunch of tools we could run over the object files in wheels for this; wasm-opt and wasm-dce are the first to come up in search results. I don't know enough about how to do this yet, so I'm documenting some resources here for our reference.

  • https://hacks.mozilla.org/2018/01/shrinking-webassembly-and-javascript-code-sizes-in-emscripten/
  • https://github.com/xtuc/webassemblyjs/tree/master/packages/dce
  • https://github.com/gonowa/wasm-opt
  • https://kripken.github.io/emscripten-site/docs/optimizing/Optimizing-Code.html#code-size
  • https://webassembly.github.io/wabt/doc/wasm-strip.1.html

Otherwise, I'm sure we are already building everything in release mode with LTO, etc., and using only the required amount of INITIAL_MEMORY. Build backends are not always responsible for stripping binary information, and while some do support calling strip on the .so files (scikit-build-core, for one), I don't know if they are aware of Emscripten-specific tools for this.
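As a starting point for measuring any gains, here is a minimal Python sketch (not an existing pyodide-pack feature) that lists the WebAssembly shared objects in a wheel with their sizes, and optionally runs `wasm-opt -Oz` on an extracted file. The function names `wasm_shared_objects` and `try_wasm_opt` are hypothetical. It assumes that `wasm-opt` from Binaryen is on `PATH`, and it has not been verified that the optimised output still loads as an Emscripten side module or that no extra feature flags are needed.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path

WASM_MAGIC = b"\0asm"  # first four bytes of every WebAssembly binary


def wasm_shared_objects(wheel_path):
    """Return {archive member name: uncompressed size} for wasm .so files in a wheel."""
    sizes = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for info in wheel.infolist():
            if not info.filename.endswith(".so"):
                continue
            # Skip .so files that are not WebAssembly (e.g. native ELF objects).
            with wheel.open(info) as member:
                if member.read(4) == WASM_MAGIC:
                    sizes[info.filename] = info.file_size
    return sizes


def try_wasm_opt(so_path, out_path):
    """Run `wasm-opt -Oz` if it is on PATH; return (before, after) sizes, else None.

    Assumption: the default feature detection is sufficient for the module.
    """
    wasm_opt = shutil.which("wasm-opt")
    if wasm_opt is None:
        return None
    subprocess.run([wasm_opt, "-Oz", str(so_path), "-o", str(out_path)], check=True)
    return Path(so_path).stat().st_size, Path(out_path).stat().st_size
```

Comparing the two numbers returned by `try_wasm_opt` for each entry reported by `wasm_shared_objects` would give a first idea of whether a post-build pass is worth the added complexity.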

Other ways to reduce binary size at build time (install tags, better support in build backend tooling, etc.) are best covered in pyodide-build, not here – we are concerned with what happens after build time.

agriyakhetarpal, Mar 11 '25 10:03

xref https://github.com/pyodide/pyodide/issues/4289

agriyakhetarpal, Mar 14 '25 18:03

FYI, when I investigated this, my impression was that the Emscripten options we used were already close to optimal, and the few tools I tried weren't worth it in terms of gains versus complexity. In particular, most of the size in packages seemed to be due to duplicated data in Cython-generated .so files.
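One rough way to check the duplication point is to count how many bytes sit in fixed-size chunks that repeat within or across the .so files of a package. The sketch below is only a heuristic under that assumption: `duplicated_bytes` is a hypothetical helper, the chunk size is arbitrary, and only aligned chunks are compared, so shifted copies of the same data are missed.

```python
from collections import Counter


def duplicated_bytes(blobs, chunk_size=64):
    """Estimate bytes taken by aligned chunks that repeat an earlier chunk.

    blobs: iterable of bytes objects, e.g. the contents of each .so file.
    Each chunk's first occurrence is counted as unique; later ones as duplicates.
    """
    seen = Counter()
    for blob in blobs:
        # Walk the blob in non-overlapping chunks, ignoring a short tail.
        for start in range(0, len(blob) - chunk_size + 1, chunk_size):
            seen[blob[start:start + chunk_size]] += 1
    return sum((count - 1) * chunk_size for count in seen.values())
```

Dividing the result by the total size of the inputs gives a crude duplication ratio that can be compared between packages.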

But indeed, maybe I missed something, or the tooling situation has changed in the last few years.

rth, Mar 15 '25 16:03