tpp-mlir

Support lowering of vector.contract to amx for brgemm

Open shahidact opened this issue 1 year ago • 0 comments

FP32 brgemm can be lowered using FMAs, but this approach cannot be
used for BF16 inputs.

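Intel AMX provides tile registers of up to 16 rows by 64 bytes
each, together with the TMUL functional unit that multiplies them.
For the bf16 data type this gives a 16x32 operand tile that
accumulates into a 16x16 f32 tile, with corresponding tile load,
store, and multiply instructions. This pass lowers the tiled brgemm
from the vector dialect to the AMX dialect, which is subsequently
lowered to AMX instructions.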
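
To make the tiling concrete, here is a minimal Python sketch of the
computation, not the pass itself. It assumes tile shapes of 16x32 for
the bf16 operands and 16x16 for the f32 accumulator. The function names
(brgemm_reference, tile_mul_acc, brgemm_tiled) are hypothetical and
exist only for this illustration. The sketch uses plain Python floats,
so it does not model bf16 rounding, VNNI packing of the B operand, or
the actual vector/AMX dialect ops.

```python
# Assumed tile shapes: 16x32 bf16 operand tiles, 16x16 f32 accumulator tile.
TILE_M, TILE_N, TILE_K = 16, 16, 32


def brgemm_reference(A, B, C):
    """C[m][n] += sum over batch b and reduction k of A[b][m][k] * B[b][k][n]."""
    for a, b in zip(A, B):
        for m in range(len(C)):
            for n in range(len(C[0])):
                C[m][n] += sum(a[m][k] * b[k][n] for k in range(len(b)))
    return C


def tile_mul_acc(acc, a, b, m0, n0, k0, M, N, K):
    """Stand-in for one tile multiply-accumulate (hypothetical helper).

    Multiplies the block of `a` starting at (m0, k0) by the block of `b`
    starting at (k0, n0) and adds the result into `acc`. Blocks are
    clipped at the matrix edges.
    """
    for m in range(m0, min(m0 + TILE_M, M)):
        for n in range(n0, min(n0 + TILE_N, N)):
            for k in range(k0, min(k0 + TILE_K, K)):
                acc[m - m0][n - n0] += a[m][k] * b[k][n]


def brgemm_tiled(A, B, C):
    """Same result as brgemm_reference, computed one accumulator tile at a time."""
    M, N, K = len(C), len(C[0]), len(B[0])
    for m0 in range(0, M, TILE_M):
        for n0 in range(0, N, TILE_N):
            # Load the accumulator tile once; it stays live across the
            # whole batch and reduction loops.
            acc = [[C[m][n] for n in range(n0, min(n0 + TILE_N, N))]
                   for m in range(m0, min(m0 + TILE_M, M))]
            for a, b in zip(A, B):
                for k0 in range(0, K, TILE_K):
                    tile_mul_acc(acc, a, b, m0, n0, k0, M, N, K)
            # Store the accumulator tile back to C.
            for i, row in enumerate(acc):
                for j, value in enumerate(row):
                    C[m0 + i][n0 + j] = value
    return C
```

The point of the loop order is that the accumulator tile is loaded
once and stored once per output block, while the batch and K loops run
innermost, which is the reuse pattern a tile-register lowering would
aim for.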

shahidact • Feb 28 '25 05:02