triton
Implement scaled_dot(mxfp8, fp8) via mma
Initial implementation using mma.
Still need to test that it plays well with the pipeliner.
So this breaks the pipeliner somehow?
No, it's just that I haven't tested whether it works as expected yet.
BTW, it would be nice to have lit tests for accelerate_matmul, as well as for the lowering of upscale to LLVM.
@ThomasRaoux addressed the review (at long last)