[AMD] Support FP8E5M2 with MFMA FP16 instructions

Open • binarman opened this issue 7 months ago • 1 comment

Cast dot arguments from the unsupported FP8E5M2 type to the supported FP16 type so that the dot operation can use MFMA instructions instead of FMA. This approach is expected to give better performance and be more stable than the FMA implementation.
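To illustrate why this upcast is cheap and lossless: FP8E5M2 has the same sign and exponent layout as FP16 (1 sign bit, 5 exponent bits) and only a shorter mantissa, so an FP8E5M2 byte is exactly the high byte of the equivalent FP16 value. The sketch below is plain Python and only models the numerics. It is not Triton's actual lowering, all function names are hypothetical, and a Python float stands in for the hardware accumulator.

```python
import struct


def fp8e5m2_to_fp16_bits(byte: int) -> int:
    """Upcast an FP8E5M2 bit pattern to FP16 by placing it in the high byte.

    The 8 low mantissa bits of the FP16 value are zero, so no rounding occurs.
    """
    return (byte & 0xFF) << 8


def fp16_bits_to_float(bits: int) -> float:
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def fp8e5m2_to_float(byte: int) -> float:
    """Decode an FP8E5M2 byte via the FP16 upcast."""
    return fp16_bits_to_float(fp8e5m2_to_fp16_bits(byte))


def dot_via_fp16(a_bytes, b_bytes) -> float:
    """Model a dot product whose FP8E5M2 operands are first cast to FP16.

    Assumption: the Python float accumulator is only a stand-in for the
    accumulator used by the real matrix instruction.
    """
    acc = 0.0
    for x, y in zip(a_bytes, b_bytes):
        acc += fp8e5m2_to_float(x) * fp8e5m2_to_float(y)
    return acc


if __name__ == "__main__":
    a = [0x3C, 0x40]  # 1.0, 2.0 in FP8E5M2
    b = [0x3E, 0xC0]  # 1.5, -2.0 in FP8E5M2
    print(dot_via_fp16(a, b))
```

Because the conversion is a pure bit shift, every FP8E5M2 value (including infinities and NaNs) maps to an FP16 value with the same meaning, which is what makes casting the operands before the matrix instruction a safe substitute for native FP8 support.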

binarman • Jul 04 '24 21:07