onnx-mlir
Lower onnx.transpose to memref.transpose for better performance
The current implementation of onnx.transpose actually shuffles the data, which is expensive. It would be better to use memref.transpose,
which only changes metadata.
This is the description of memref.transpose:
let summary = "`transpose` produces a new strided memref (metadata-only)";
let description = [{
The `transpose` op produces a strided memref whose sizes and strides
are a permutation of the original `in` memref. This is purely a metadata
transformation.
Example:
```mlir
%1 = memref.transpose %0 (i, j) -> (j, i) : memref<?x?xf32> to memref<?x?xf32, affine_map<(d0, d1)[s0] -> (d1 * s0 + d0)>>
```
}];
It looks like memref.transpose will give better performance.
I assume that this approach works only when the transpose is the only user of the input tensor.
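To make the difference concrete, here is a minimal sketch in plain Python of a metadata-only transpose next to a copying one. It is not onnx-mlir or MLIR code: `StridedView`, `transpose_view`, and `transpose_copy` are hypothetical names that only model the idea of a buffer plus sizes and strides. It also shows one possible reason for caution when the input has other users: the transposed view aliases the input buffer.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass
class StridedView:
    """Hypothetical stand-in for a strided memref: a shared buffer plus metadata."""
    data: List[float]         # flat buffer, possibly shared between views
    sizes: Tuple[int, ...]    # extent of each dimension
    strides: Tuple[int, ...]  # elements to skip per unit step in each dimension

    def at(self, *idx: int) -> float:
        # Linearize the index through the strides; no data is moved.
        return self.data[sum(i * s for i, s in zip(idx, self.strides))]


def transpose_view(v: StridedView, perm: Tuple[int, ...]) -> StridedView:
    # Metadata-only: permute sizes and strides and reuse the same buffer.
    return StridedView(v.data,
                       tuple(v.sizes[p] for p in perm),
                       tuple(v.strides[p] for p in perm))


def transpose_copy(v: StridedView, perm: Tuple[int, ...]) -> StridedView:
    # Data-shuffling variant: materialize the permuted elements in a new buffer.
    t = transpose_view(v, perm)
    out = [t.at(*idx) for idx in product(*(range(n) for n in t.sizes))]
    # Row-major strides for the new shape.
    strides, step = [], 1
    for n in reversed(t.sizes):
        strides.append(step)
        step *= n
    return StridedView(out, t.sizes, tuple(reversed(strides)))


# A 2x3 row-major input and its two transposes.
a = StridedView([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3), (3, 1))
view = transpose_view(a, (1, 0))   # shares a.data
copy = transpose_copy(a, (1, 0))   # owns a fresh buffer

# A later write to the input is visible through the view but not the copy.
a.data[5] = 42.0
```

In this sketch `transpose_view` does a constant amount of work regardless of the tensor size, while `transpose_copy` touches every element. The price is that `view` and `a` share storage, so any later write to one is observed through the other.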