binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

Use SmallSet in EffectsAnalyzer

Open Squareys opened this issue 3 years ago • 6 comments

Hi @kripken & @sbc100 !

As promised in #4884, this is split off into it's own PR. For it to have any effect, though, it still depends on avoiding the default constructor allocation, implemented in that other PR.

Benchmarks for the SmallSet template argument following in short.

Best, Jonathan

Squareys avatar Aug 09 '22 14:08 Squareys

Here's the benchmarks (edited for readability), script, input wasm and the executables can be found here

$ ./run-wasm-opt.sh
wasm-opt-baseline:
real    0m23.973s

wasm-opt-no-set-allocation:
real    0m23.209s

wasm-opt-smallset-10:
real    0m18.224s

wasm-opt-smallset-4:
real    0m17.072s

wasm-opt-smallset-2:
real    0m17.087s

wasm-opt-smallset-1:
real    0m17.584s

wasm-opt-smallset-0:
real    0m19.192s
---------------

Here are a couple of culprits:

Where with EffectsAnalyzer, avoiding the default constructor allocation in std::set has a great effect, it seems to cause a 10% pessimization in the rest of the code. This could be because EffectsAnalyzer has a lot of empty sets and usually few elements (which is possibly affirmed by the smaller SmallSet sizes being faster), in which case the std::optional overhead would also barely show an effect.

The rest of the code may be way less effected by that one allocation, if they drown out the improvement there by allocating many times later by exceeding the small size, and also the optional overhead might weigh in way more here.

Please also note that this benchmark is heavily biased to the way Wonderland Engine writes code (which is a non-standard flavor over heavily optimized data-oriented C++) and probably not representative of "the average C++ wasm". You may want to run the benchmarks with other .wasm input to test.

All in all, @kripken, I'm considering wheter it wouldn't make sense to revert the changes to SmallSet and just use std::optional here in EffectsAnalyzer?

Squareys avatar Aug 09 '22 15:08 Squareys

Thanks!

Ok, I did some benchmarking on a UE4 game, a Dart benchmark, and Poppler, all on Linux. Raw times are noisy but the instruction counts from perf stat are pretty robust. Numbers are in K of instructions:

baseline SmallSet<10> in effects.h SmallSet<1> SmallSet<0> just SmallSet, no effects.h changes
Dart 27,193 26,793 26,878 27,232 27,088
Poppler 16,262 15,836 15,812 16,246 16,275
UE4 495,164 479,595 480,944 496,245 495,671

(runtimes are generally similar to these numbers).

Looks like SmallSet<0> and just changing SmallSet (no changes to effects.h) make almost no difference compared to baseline. 1 and 10 do make a difference. I'd say 1 is the safer option since it is the smallest memory size increase.

I also lean towards changing SmallSet. While the change by itself doesn't change much, as we use SmallSet more it will help more, and will avoid us needing to add std::optional in more places. And I do think it makes sense to do, the more I think on it: SmallSet assumes we will rarely need the full set, and using std::optional there makes us just have a pointer for it, rather than its full contents (which will be at least a pointer + a size).

So I suggest we land #4884 and then land this but with 1 instead of 10. What do you think?

kripken avatar Aug 09 '22 17:08 kripken

@kripken Sounds great! I updated this PR to use 1 instead of 10!

Squareys avatar Aug 09 '22 17:08 Squareys

Rebased onto the changes to #4884 !

Squareys avatar Sep 28 '22 13:09 Squareys

Rebased onto the changes to https://github.com/WebAssembly/binaryen/pull/4884 once again, which should also fix the test case that was failing over there.

Also noticed I had apparently missed two sets that were still on 10 instead of 1 element. Hope that wasn't just my faulty memory and it was intended to keep those different for some reason.

Squareys avatar Nov 23 '22 11:11 Squareys

Rebased onto the changes to https://github.com/WebAssembly/binaryen/pull/4884 !

Squareys avatar Mar 29 '23 21:03 Squareys