Explore the size reduction of Emscripten shared objects
There are several tools we could run over the .so files in wheels for this; wasm-opt and wasm-dce are the first ones that come up in search results. I don't know enough about how to do this yet, so I'm documenting some resources here for our reference.
- https://hacks.mozilla.org/2018/01/shrinking-webassembly-and-javascript-code-sizes-in-emscripten/
- https://github.com/xtuc/webassemblyjs/tree/master/packages/dce
- https://github.com/gonowa/wasm-opt
- https://kripken.github.io/emscripten-site/docs/optimizing/Optimizing-Code.html#code-size
- https://webassembly.github.io/wabt/doc/wasm-strip.1.html
Otherwise, I'm sure we are already building everything in release mode, with LTO, etc., and using only the required amount of INITIAL_MEMORY. Build backends are not always responsible for stripping binary information, and while some do support calling strip on the .so files (scikit-build-core, for one), I don't know if they are aware of Emscripten-specific tools for this.
Other ways to reduce binary size at build time (install tags, better support for build backend tooling, etc.) are best covered in pyodide-build, not here – we are concerned with what happens after build time.
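As a starting point for experiments, here is a rough sketch of what a post-build pass could look like. It assumes the wheel has already been unpacked into a directory (e.g. with `wheel unpack`) and that Binaryen's `wasm-opt` is on `PATH`. The function names are made up for illustration and are not part of any existing tool. I have not verified that `-Oz` output always still loads as an Emscripten side module, or whether extra feature flags are needed, so treat it as something to measure with rather than something to ship.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def wasm_opt_command(src: Path, dst: Path) -> list[str]:
    """Build a wasm-opt invocation that optimizes for size (-Oz)."""
    return ["wasm-opt", "-Oz", str(src), "-o", str(dst)]


def shrink_shared_objects(unpacked_wheel: Path) -> dict[str, tuple[int, int]]:
    """Try wasm-opt on every .so under an unpacked wheel (hypothetical helper).

    Returns {relative path: (size before, size after)}. A file is left
    untouched when wasm-opt is missing, fails, or does not make it smaller.
    """
    results: dict[str, tuple[int, int]] = {}
    have_tool = shutil.which("wasm-opt") is not None
    for so in sorted(unpacked_wheel.rglob("*.so")):
        before = so.stat().st_size
        after = before
        if have_tool:
            # Write to a sibling file first so a failed run cannot corrupt the original.
            tmp = so.with_name(so.name + ".opt")
            proc = subprocess.run(wasm_opt_command(so, tmp), capture_output=True)
            if proc.returncode == 0 and tmp.exists() and tmp.stat().st_size < before:
                tmp.replace(so)
                after = so.stat().st_size
            else:
                tmp.unlink(missing_ok=True)
        results[so.relative_to(unpacked_wheel).as_posix()] = (before, after)
    return results
```

After modifying files, the wheel's RECORD hashes no longer match, so the wheel would need to be repacked (`wheel pack` regenerates RECORD) before it can be installed.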
xref https://github.com/pyodide/pyodide/issues/4289
FYI, when I investigated this, my impression was that the Emscripten options we used were already close to optimal, and the few tools I tried weren't worth it in terms of gains versus complexity. In particular, most of the size in packages seemed to come from duplicated data in Cython-generated .so files.
But indeed, maybe I missed something, or the tooling situation has changed in the last few years.
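To re-check the observation about where the bytes go, one option is to break a .so down by WebAssembly section and compare the data section against the code section. The sketch below only walks the top-level section headers of the wasm binary format (section id, LEB128 size, payload) and does not look inside the sections; `section_sizes` is a hypothetical helper, not an existing API.

```python
from __future__ import annotations

# Section ids from the WebAssembly binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "datacount", 13: "tag",
}


def read_uleb128(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def section_sizes(module: bytes) -> dict[str, int]:
    """Return payload bytes per section; custom sections are keyed by name."""
    if module[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    sizes: dict[str, int] = {}
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(module):
        section_id = module[pos]
        size, body = read_uleb128(module, pos + 1)
        name = SECTION_NAMES.get(section_id, f"unknown({section_id})")
        if section_id == 0:
            # A custom section starts with a length-prefixed UTF-8 name.
            name_len, name_start = read_uleb128(module, body)
            raw = module[name_start:name_start + name_len]
            name = "custom:" + raw.decode("utf-8", "replace")
        sizes[name] = sizes.get(name, 0) + size
        pos = body + size
    return sizes
```

Running `section_sizes(Path("some_module.so").read_bytes())` on a few Cython extensions should show whether `data` still dominates over `code`, and how much sits in custom sections that a strip step could remove.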