cutlass
Unit tests for kernels that perform BF16 x BF16 = MXFP8 and MXFP8 x MXFP8 = BF16