binaryen icon indicating copy to clipboard operation
binaryen copied to clipboard

Improving Precompute performance

Open kripken opened this issue 1 year ago • 4 comments

Running on the binary from Google Sheets,

BINARYEN_CORES=1 perf record bin/wasm-opt binary.wasm -all --precompute --no-validation

(1 core to avoid noise from multithreading; validation disabled to focus on the optimization)

The top item from perf report is

     7.41%  wasm-opt  libbinaryen.so        [.] wasm::Type::isRef() const

Reading the code, I can't see an obvious reason for that slowness.

@tlively You mentioned some lock you might look into with wasm::Type - is this related? I don't seem to see a lock taken here though.

My only other guess is that this may just be called many times, and by not being in a header, it isn't getting inlined.

kripken avatar Sep 11 '24 21:09 kripken

The lock shouldn't be related, no. That lock is only taken when creating a new HeapType (edit: or Type). The inlining hypothesis seems likely. Perhaps you could look for places that handle the isRef() case before handling the isBasic() case. isBasic() should be super fast and inlineable, so handling that first should generally be faster.

tlively avatar Sep 11 '24 22:09 tlively

I experimented with moving all the isBasic() checks into the header, to make them fast:

https://github.com/kripken/binaryen/commit/90b6e7a45862cb6db7e04ae6d9ab63637e9d2107?w=1

That does not help, though. Runtime is the same, and the profile shows isRef() replaced with isNonBasicRef(), that is, the majority of the calls seem to go through the slow path anyhow, which is not inlined.

So this still seems like a surprising amount of overhead. @tlively I wonder if we can at least make isRef() super-fast, as it only differentiates references from tuples? Maybe another representation of tuples - which are rare and do not need to be fast - could help the references case.

kripken avatar Sep 11 '24 23:09 kripken

I once tried to change the representation of reference types to be pointers to HeapType infos with their nullability encoded in their low bits, but didn't quite get it working. It would be good to try again. This would also remove the need to take the lock when creating a Type other than a tuple type.

tlively avatar Sep 12 '24 00:09 tlively

I checked the profile here again today and Type::isRef() is thankfully no longer anywhere to be seen. The lowest hanging fruit is now

  • Literal::~Literal() (12.4%)
  • Literal::Literal(const Literal&) (11.9%)
  • Flow::Flow(Literal) (11.6%)

tlively avatar Dec 13 '24 20:12 tlively