binaryen
wasm-emscripten-finalize runs out of memory with -g option
I'm trying to build and debug a large project, but during the build the wasm-emscripten-finalize step needs 18-20 GB of RAM to complete. Looking at the code, my guess is that this is because the metadata is stored in a single string here: https://github.com/WebAssembly/binaryen/blob/be02d3f0f2689475f31c4523010eed58f39d27cb/src/tools/wasm-emscripten-finalize.cpp#L302
It seems like the EmscriptenGlueGenerator could be reworked to stream that string as it is built, so that it doesn't have to be held in memory and then written out at the end.
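To make the streaming idea concrete, here is a minimal sketch of the general pattern. It is not Binaryen's actual code: `writeNamesStreaming` and `buildNamesInMemory` are hypothetical names, and the metadata is assumed to be a simple JSON array of strings that need no escaping.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Binaryen API): writes a JSON array of names
// straight to `out`, one element at a time, so the full text never has to
// exist as a single in-memory string.
void writeNamesStreaming(std::ostream& out,
                         const std::vector<std::string>& names) {
  assert(out.good() && "output stream must be writable");
  out << '[';
  bool first = true;
  for (const auto& name : names) {
    if (!first) {
      out << ',';
    }
    first = false;
    // Assumption: names contain no characters that need JSON escaping.
    out << '"' << name << '"';
  }
  out << ']';
}

// The accumulate-then-write pattern, shown for comparison: the whole
// output is held in memory before anything is emitted.
std::string buildNamesInMemory(const std::vector<std::string>& names) {
  std::ostringstream buffer;
  writeNamesStreaming(buffer, names);
  return buffer.str();
}
```

Passing `std::cout` or a `std::ofstream` to `writeNamesStreaming` keeps only the current element in flight, whereas `buildNamesInMemory` holds the complete output at once. Whether this matters in practice depends on how large the metadata really is.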
Another possibility might be that this is due to DWARF rewriting. That's been my experience. To test this, you can strip debug info from the linker inputs.
Otherwise, it would be good to confirm what is causing the increase, for example by adding some code to measure memory usage before and after the line you suspect.
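One way to take such a before/after measurement is sketched below, assuming a POSIX system where `getrusage` is available. `peakRss` and `peakRssGrowth` are hypothetical helper names, not part of Binaryen. Note that `ru_maxrss` is a high-water mark (reported in kilobytes on Linux and in bytes on macOS), so this only shows growth when the suspect code pushes the peak higher.

```cpp
#include <sys/resource.h>

#include <cassert>
#include <cstdio>

// Peak resident set size of this process so far.
// Units are platform-dependent: kilobytes on Linux, bytes on macOS.
long peakRss() {
  struct rusage usage;
  int rc = getrusage(RUSAGE_SELF, &usage);
  assert(rc == 0 && "getrusage failed");
  (void)rc; // avoid an unused-variable warning when asserts are disabled
  return usage.ru_maxrss;
}

// Hypothetical helper: runs `work`, logs the peak RSS before and after to
// stderr, and returns how much the peak grew while `work` ran.
template <typename Fn>
long peakRssGrowth(const char* label, Fn&& work) {
  long before = peakRss();
  work();
  long after = peakRss();
  std::fprintf(stderr, "%s: peak RSS %ld -> %ld\n", label, before, after);
  return after - before;
}
```

Wrapping the suspect statement in a lambda, e.g. `peakRssGrowth("metadata", [&] { /* suspect code */ });`, gives a rough per-step figure without needing an external profiler.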
@kripken Any news on the memory use of wasm-emscripten-finalize? We're hitting this problem in builds of arrow, and it is a pain having to do everything with CMAKE_BUILD_PARALLEL_LEVEL set to 1.
@joemarshall If you provide a testcase I can profile memory usage there. Otherwise the only known issue is DWARF, as mentioned above, which has a simple workaround (don't compile source files with DWARF, or strip DWARF during link), and doesn't have a good solution (since DWARF is often very large, and complex to represent - so it is expected that a lot of memory would be used).
Looking at it with valgrind --tool=massif on a snapshot of a medium-sized build (a test fixture for Apache Arrow):
1.1 GB in MixedArena, which is static after an initial load. This appears to be the wasm binary data itself.
309 MB of wasm strings.
A growing amount of data in hashtables (a bit under 2 GB).
The big hashtables appear to be all DWARF stuff as you say. I wonder if it would be worth using a faster + more memory efficient hashtable implementation here - e.g. https://github.com/greg7mdp/parallel-hashmap
Even if it only halved the amount of memory used, it would be a big win, and would have potentially big performance improvements for anyone wanting to have full debug info included.
Allocations are mainly at: https://github.com/WebAssembly/binaryen/blob/9d0740c8d84f822567bb4d08784238dd5a89b43f/src/wasm/wasm-binary.cpp#L4195 and https://github.com/WebAssembly/binaryen/blob/9d0740c8d84f822567bb4d08784238dd5a89b43f/src/wasm/wasm-stack.cpp#L2493
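As a rough illustration of why the choice of hashtable can matter for memory, the sketch below counts the live bytes a node-based `std::unordered_map` requests per entry for a 16-byte key/value payload. This is a generic standard-library experiment, not a measurement of Binaryen's DWARF tables; `CountingAllocator`, `g_liveBytes`, and `bytesPerEntry` are names invented for the example, and the figure excludes any per-allocation overhead added by `malloc` itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>

// Bytes currently allocated through CountingAllocator.
static std::size_t g_liveBytes = 0;

// Minimal allocator that forwards to std::allocator and tracks live bytes.
template <typename T>
struct CountingAllocator {
  using value_type = T;
  CountingAllocator() = default;
  template <typename U>
  CountingAllocator(const CountingAllocator<U>&) {}
  T* allocate(std::size_t n) {
    g_liveBytes += n * sizeof(T);
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T* p, std::size_t n) {
    g_liveBytes -= n * sizeof(T);
    std::allocator<T>().deallocate(p, n);
  }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
  return false;
}

// Average live bytes per entry after inserting `count` 8-byte keys that
// each map to an 8-byte value (16 bytes of payload per entry).
std::size_t bytesPerEntry(std::size_t count) {
  assert(count > 0);
  using Key = std::uint64_t;
  using Value = std::uint64_t;
  using Map = std::unordered_map<Key, Value, std::hash<Key>,
                                 std::equal_to<Key>,
                                 CountingAllocator<std::pair<const Key, Value>>>;
  g_liveBytes = 0;
  Map map;
  for (std::size_t i = 0; i < count; ++i) {
    map.emplace(i, i);
  }
  // Measured while the map is still alive: nodes plus the bucket array.
  return g_liveBytes / count;
}
```

The result is larger than the 16-byte payload because each entry lives in its own node with at least a link pointer, plus a slot in the bucket array. Open-addressing tables avoid the per-node allocation, which is where any saving from an alternative implementation would be expected to come from; how much that helps here would need to be measured on a real build.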
Interesting data, thanks. Good to confirm it is mostly DWARF, as we suspected.
(The 1.1GB in MixedArena is the wasm binary, yeah, and that's very heavily optimized already for size. 1.1GB is a lot, but I imagine the wasm binary is fairly large?)
Investigating a better hashmap definitely sounds like a worthwhile direction.
For anyone else experiencing this, -gsplit-dwarf seems to help a lot, although it would be nice to reduce things further, because even so I'm still unable to run with 2 or more link jobs.
Looking at the size of the wasm binaries, they're approximately 300 MB each, so 1.1 GB is probably not surprising there.