exception-handling
Measure size increase for enabling EH on modern C++ codebases
This is a request from people working on the producer side to get some real-world measurements of the size increase of enabling exception handling in large C++ codebases, particularly those that heavily use the common RAII style, which will end up giving a large percentage of functions one or more `catch_all` blocks.
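To make the concern concrete, here is a minimal sketch of the RAII pattern in question. The names `Guard`, `may_throw`, `work` and `run` are hypothetical and only for illustration. With EH enabled, a function like `work` typically needs a second, unwinding-only exit path that runs the destructor (a cleanup landing pad, which would correspond to the `catch_all` blocks mentioned above), in addition to the normal exit path.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical RAII type: logs when it is constructed and destroyed.
struct Guard {
    std::vector<std::string>& log;
    std::string name;
    Guard(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("acquire " + name);
    }
    ~Guard() { log.push_back("release " + name); }
};

// A callee that may throw. Any caller with a live Guard needs a cleanup path.
void may_throw(bool fail) {
    if (fail) throw std::runtime_error("boom");
}

// The destructor of `g` must run on both the normal exit and the unwinding
// exit, so the compiler usually emits cleanup code for each path.
void work(std::vector<std::string>& log, bool fail) {
    Guard g(log, "g");
    may_throw(fail);
    log.push_back("done");
}

// Runs `work` and returns the sequence of logged events.
std::vector<std::string> run(bool fail) {
    std::vector<std::string> log;
    try {
        work(log, fail);
    } catch (const std::runtime_error&) {
        log.push_back("caught");
    }
    return log;
}
```

Calling `run(false)` exercises the normal exit path and `run(true)` the unwinding one; in both cases the `Guard` is released, which is exactly the code a build with `-fignore-exceptions` is allowed to drop for the throwing case.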
I'm not sure what the criteria are for an acceptable increase, but if it's significant then we should probably reconsider some of the strategies that would allow sharing code between the normal and unwinding exit paths. It won't feel great if we do all this work to add EH to wasm and the general advice immediately becomes "don't enable EH".
Good question. Additionally, I'd like to understand how much of this size is metadata versus code.
Hah, I had this same question again today and when I did a quick search to see if anyone had already asked/answered this question I found this old issue :) Just thought I'd bump this b/c I'm still quite interested to see any measurements here. Thanks!
We made end-to-end support work for small tests and are starting to test larger programs now, so we don't have reliable data on this yet. I think they will be available in a few to several months (depending on the size of programs), and I'll update this issue on that.
We don't have results from a complete benchmark suite or something at this point, but now we have some data points.
Note that some amount of code size increase is inevitable because there is simply more code to compile. `-fignore-exceptions`, the baseline, compiles away all landing pads, including `catch` clauses and destructor calls for stack objects when an exception is thrown.
We compiled Binaryen using Wasm and compared code size increases with the baseline (`-fignore-exceptions`). The LTO version of `emcc -Oz` showed a 9.1% increase in code size of `wasm-opt.wasm`, the main code optimizer in Binaryen. The old Emscripten-style exception handling showed a 42.8% increase in the same setup. So 9.1% vs. 42.8% is a clear win for code size using Wasm EH. Other tools in Binaryen showed similar numbers. The non-LTO version also showed a big difference:
Platform / Build | Code Size Increase |
---|---|
Wasm EH, emcc -Oz (latest), non-LTO | 13.7% |
Wasm EH, emcc -Oz (latest), LTO | 9.1% |
Emscripten EH, emcc -Oz (latest), non-LTO | 58.9% |
Emscripten EH, emcc -Oz (latest), LTO | 42.8% |
We also checked how native platforms perform for the same `wasm-opt` binary. All code size increases are relative to the baseline (`-fignore-exceptions`). Binaries are statically linked, as in the Wasm builds.
Platform / Build | Code Size Increase |
---|---|
Linux (x86), clang -Oz (12.0.1), non-LTO | 5.7% |
Linux (x86), clang -Oz (12.0.1), LTO | 7.1% |
Windows (x86), clang -Oz (latest), non-LTO | 20.0% |
Windows (x86), clang -Oz (latest), LTO | 22.0% |
Wasm EH's numbers are still a little higher than Linux x86's, but the code size highly depends on how many optimizations the toolchain can do, and given that we have spent very little time on optimizing EH code in our toolchain so far, I think we still have much room for improvement. But the bottom line is that the current numbers already show the advantage of using the new proposal over the old Emscripten-style EH.
Most (about 97.5%) of the code size increase comes from the code section. The remaining 2.5% comes from the data section, which contains LSDA info (= the exception table for each function that has landing pads).
We also asked a partner we are working with, who has a large cross-platform application they are porting to the web. Their application uses exceptions heavily and is much bigger than Binaryen. With `-O2` and non-LTO builds, they reported a 13.4% code size increase. It looks like they haven't tried an LTO build yet, so they may benefit further from that. We also asked them to build with Emscripten EH; they ran into a problem and couldn't complete the build, but we can say the size was much larger.
@aheejin, thank you! If I read this correctly, "code size" for wasm is uncompressed wasm bytecode size. I agree that 10-15% increase for bytecode that uses exception handling pervasively (as would C++) is reasonable, especially as this is the ceiling.
Also interesting to me would be the changes in the size of machine code and in-memory metadata in V8, as this can matter just as much as bytecode size, especially on mobile and in other size-constrained environments. Once we have an implementation in SpiderMonkey we'll collect and publish these numbers too, but in the meantime any numbers from V8 would be welcome.
Hi, I just came across this thread. I am working on a project called OpenCascade.js, which is converting the rather large OpenCascade project to WebAssembly using Emscripten. The current beta version of OpenCascade.js has ~47 MB (without exception support). If there is an easy way to test Wasm EH using some Emscripten compiler / linker flags (which ones?) - and if it's interesting for you, I would be happy to run some builds and compare the results. It might also be interesting to look at performance - I have experienced huge performance drops with Emscripten's implementation and would be curious if those go away with Wasm EH.
Emscripten's docs now cover enabling exceptions in emscripten. To test with Chrome, you can go to chrome://flags and turn on #enable-experimental-webassembly-features
Thanks a lot for pointing me at the docs! Didn't know (and expect) that this was already documented.
I made the following (custom) builds of the library
Those two builds are custom builds of the library with a reduced feature set (hence the smaller file size than previously mentioned). Unfortunately, I wasn't able to create a full build of the library since I was running into an LLVM error - I might file an issue with them about this.
The difference in file size between Wasm EH and Emscripten EH is (main JS + WASM + worker JS):
- Disabled EH: 16,337,122 bytes
- Emscripten EH: 23,203,693 bytes (+42% vs disabled EH)
- Wasm EH: 20,221,475 bytes (+24% vs disabled EH)
Difference Wasm EH vs Emscripten EH: -2,982,218 bytes => ~13% smaller.
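The percentages above can be re-derived from the raw byte counts. The helper below is only a sketch of that arithmetic; `percent_change` is a hypothetical name, not something from Emscripten or the builds discussed here.

```cpp
#include <cstdint>

// Relative size change of `size` versus `baseline`, in percent.
// Positive means larger than the baseline, negative means smaller.
double percent_change(std::int64_t baseline, std::int64_t size) {
    return 100.0 * static_cast<double>(size - baseline) /
           static_cast<double>(baseline);
}
```

With the numbers from the list, `percent_change(16337122, 23203693)` is about 42.0, `percent_change(16337122, 20221475)` is about 23.8, and `percent_change(23203693, 20221475)` is about -12.9, matching the rounded +42%, +24% and ~13% figures.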
For performance testing, I tried loading a couple of different 3D models in the STEP file format (~10MB, ~100MB and ~500MB) using this prototype of a 3D viewer. ~~In terms of performance I wasn't able to notice any difference between disabled EH, full Emscripten EH, full Wasm EH and "selective Emscripten EH" (i.e. using `-sEXCEPTION_CATCHING_ALLOWED=[...]` only where I know that I need it). Last time I investigated exceptions, I had a 5 times increase in load times when using "full Emscripten EH" vs "selective Emscripten EH". So that's definitely a good thing, but might also be related to some different optimization settings I am using right now (LTO?!).~~ (This is wrong, updated test results below)
The results are great! Can't wait for this to arrive in all major runtimes.
Thanks, please do file a bug (if the LLVM bug tracker gives you trouble, you can file it at https://github.com/emscripten-core/emscripten/issues). I would find it quite surprising if you don't see much difference between "full" emscripten EH and disabled EH. Although I guess it is plausible that LTO could determine that no exceptions can be thrown on many codepaths, and remove the EH code around many calls.
I just updated my previous post with the results of the base line build - with disabled EH.
I also re-checked the performance results: Turns out that NextJS did some caching magic and I was running all my tests with the same build :expressionless:. Here are the updated and corrected performance test results, measured with a manual stopwatch with an accuracy of ~2 seconds or so. The table shows total loading times in seconds for different models.
- Panzerkampfwagen
- DS60 - I-Temp
- 1100 GSXR Streetbike
Build | Panzerkampfwagen (17MB) | DS60 - I-Temp - Step (100.5MB) | 1100 GSXR Streetbike (440MB) |
---|---|---|---|
0 Disabled EH | 49 | uncaught exception | uncaught exception |
1 Emscripten EH | 98 | 233 | 1394 |
2 Wasm EH | Aw, Snap! | Aw, Snap! | Aw, Snap! |
3 Selective Emscripten EH | 48 | 88 | 455 |
Wasm EH doesn't seem to work in my case. Tested with Chrome 92.0.4515.159 with `#enable-experimental-webassembly-features` enabled.
Just had another idea: The Wasm EH version does work without problems (and seemingly fast) when I'm having Chrome's DevTools open during the execution. Since V8's TurboFan compiler is disabled when DevTools are open, this might be an indication that the problem lies there. I am going to file a bug report about this over at bugs.chromium.org (and link it here).
@donalffons Thanks for your experiments! So apparently there are two problems:
One is the compilation error with LLVM when you are building the library

> Unfortunately, I wasn't able to create a full build of the library since I was running into an LLVM error - I might file an issue with them about this.

and the other is a runtime error?

> Just had another idea: The Wasm EH version does work without problems (and seemingly fast) when I'm having Chrome's DevTools open during the execution. Since V8's TurboFan compiler is disabled when DevTools are open, this might be an indication that the problem lies there.

For the compilation error, I'd appreciate it if you let us know about it at https://github.com/emscripten-core/emscripten/issues. It'd also be great if you CC me there. You can also file an LLVM bug report at https://bugs.llvm.org/, but I think the former is easier to use if you haven't used the LLVM Bugzilla or don't have an account there.
About the runtime error on TurboFan, you can file a bug at https://bugs.chromium.org/p/v8. Please select "WebAssembly" for the component, and you can paste the link here or CC my Chromium account: [email protected]
@aheejin, here is a link to the issue on the V8 tracker: https://bugs.chromium.org/p/v8/issues/detail?id=12255 (with an example deployment)
About the other (LLVM-related) issue: I'm still looking into that. Somehow, I wasn't able to reproduce it today, so maybe that was a fluke?! I'll let you know if I find something.
C++ EH is a mistake. Herbception please https://www.youtube.com/watch?v=ARYP83yNAWk