tpp-mlir
Remaining Issues for MLP performance on par with libxsmm-dnn
These are the known issues that remain before we reach libxsmm-dnn performance on "pre-packed layer" MLPs:
- [x] Beta=Zero (see #777, #784)
- [x] XSMM fusion (see #752)
- [ ] Allocation on page boundary (2MB)? (see the allocation sketch after this list)
- [ ] Change loop order with flags? (see the loop-order sketch after this list)
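For the 2MB item, here is a minimal sketch of what a page-boundary allocation could look like on Linux, assuming the goal is to let tensor buffers be backed by transparent huge pages. The helper name and the THP hint are illustrative only, not the tpp-mlir runtime allocator:

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>

#define TWO_MB (2u * 1024u * 1024u)

/* Allocate a buffer starting on a 2 MB boundary and hint the kernel that
 * it may be backed by huge pages. Hypothetical helper for illustration. */
static void *alloc_2mb_aligned(size_t bytes) {
  /* aligned_alloc wants the size to be a multiple of the alignment. */
  size_t rounded = (bytes + TWO_MB - 1) / TWO_MB * TWO_MB;
  void *p = aligned_alloc(TWO_MB, rounded);
#ifdef MADV_HUGEPAGE
  if (p) madvise(p, rounded, MADV_HUGEPAGE); /* Linux-only THP hint */
#endif
  return p;
}
```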
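For the loop-order item, a sketch of what changing the loop order could mean for the blocked (pre-packed) layer: the same block kernel can be driven with the M-block loop or the N-block loop outermost, which changes which packed operand stays resident in cache. `block_gemm` is a stand-in for the block kernel, not a real API:

```c
#include <stddef.h>

typedef void (*block_gemm_fn)(size_t mb, size_t nb); /* runs one MB x NB tile */

/* M-block loop outermost: the A blocks of one row are reused across nb. */
static void layer_m_outer(block_gemm_fn block_gemm, size_t m_blocks, size_t n_blocks) {
  for (size_t mb = 0; mb < m_blocks; ++mb)
    for (size_t nb = 0; nb < n_blocks; ++nb)
      block_gemm(mb, nb);
}

/* N-block loop outermost: the packed weight blocks of one column are reused across mb. */
static void layer_n_outer(block_gemm_fn block_gemm, size_t m_blocks, size_t n_blocks) {
  for (size_t nb = 0; nb < n_blocks; ++nb)
    for (size_t mb = 0; mb < m_blocks; ++mb)
      block_gemm(mb, nb);
}
```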
In theory, once all of those are in, we should reach parity. If more issues are discovered, please add them to the list. Let's only close this issue once we reach parity on the base pre-packed MLP benchmarks we have.
@chelini @alheinecke
Beta=0 is done and the benchmark IR is affected, but we only saw a <1% performance change from it, probably within noise. We didn't expect a huge change, so it's not a big deal.
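For context, a plain-C sketch of what beta=0 buys in a GEMM-based layer: with beta=1 every output tile is read-modify-written, while with beta=0 the old C values are never loaded, saving one read stream over C per tile. The kernel below is illustrative only, not the libxsmm/tpp-mlir kernel:

```c
#include <stddef.h>

/* C = beta*C + A*B for row-major MxK * KxN tiles. */
static void gemm_tile(const float *A, const float *B, float *C,
                      size_t M, size_t N, size_t K, float beta) {
  for (size_t m = 0; m < M; ++m)
    for (size_t n = 0; n < N; ++n) {
      /* beta = 0: skip loading the old C value entirely. */
      float acc = (beta == 0.0f) ? 0.0f : beta * C[m * N + n];
      for (size_t k = 0; k < K; ++k)
        acc += A[m * K + k] * B[k * N + n];
      C[m * N + n] = acc;
    }
}
```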