
[AMD] ElementwiseOpToLLVM: Do not convert types if they are equal

Open joviliast opened this issue 1 year ago • 5 comments

This commit fixes a failure in python/tutorials/03-matrix-multiplication.py for the FMA cases.

joviliast avatar Feb 07 '24 14:02 joviliast

Essentially, this patch enables the triton::FPToFP operation to cast fp16 to fp32 and back. @joviliast, is that correct?

@ptillet Do we want the FPToFP operation to be able to convert between arbitrary float types?

binarman avatar Feb 07 '24 15:02 binarman

Essentially, this patch enables the triton::FPToFP operation to cast fp16 to fp32 and back. @joviliast, is that correct?

This patch disables casting. Did you mean "pass"?

joviliast avatar Feb 07 '24 15:02 joviliast

This patch disables casting. Did you mean "pass"?

It disables only the "intermediate" casting; the rest stays in place:

  • https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1686
  • https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1700

binarman avatar Feb 07 '24 15:02 binarman

@binarman in principle yes, in practice there are many subtleties related to FP8 and rounding modes. Even our Nvidia backend isn't handling this super well today

ptillet avatar Feb 08 '24 19:02 ptillet

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

zhanglx13 avatar Feb 13 '24 14:02 zhanglx13

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

Because conversion between identical internal types is simply not supported: it fails at compile time as an unsupported conversion.

joviliast avatar Feb 22 '24 14:02 joviliast
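To illustrate the shape of the fix, here is a minimal standalone sketch of the guard being described. The function name, string-based types, and emitted "cast" text are all illustrative; the real ElementwiseOpToLLVM code operates on MLIR `Type` values and LLVM dialect ops:

```cpp
#include <cassert>
#include <string>

// Illustrative sketch only (not the actual Triton code): models skipping
// the intermediate conversion when source and destination element types
// are already equal, which previously hit the "unsupported conversion"
// compile-time failure.
std::string convertElement(const std::string &srcTy, const std::string &dstTy,
                           const std::string &value) {
  // The fix: if the types match, return the value unchanged instead of
  // emitting a same-type conversion.
  if (srcTy == dstTy)
    return value;
  // Otherwise emit a (hypothetical) cast.
  return "cast<" + srcTy + "->" + dstTy + ">(" + value + ")";
}
```

With this guard, an f32-to-f32 "conversion" becomes a no-op, while genuine conversions such as f16-to-f32 are still emitted.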

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

Because conversion between identical internal types is simply not supported: it fails at compile time as an unsupported conversion.

Can you make a minimized lit test out of this failure? That will help everybody understand (and will prevent regressions).

ThomasRaoux avatar Feb 22 '24 16:02 ThomasRaoux

Can you make a minimized lit test out of this failure? That will help everybody understand (and will prevent regressions).

@ThomasRaoux, done. I added a commit containing a lit test for the fixed case. It can be run after merging https://github.com/openai/triton/pull/3254

joviliast avatar Mar 05 '24 11:03 joviliast

The lit tests under the amd dir are not being run.

Thanks for catching that. Is someone working on fixing it?

ThomasRaoux avatar Mar 06 '24 17:03 ThomasRaoux

The lit tests under the amd dir are not being run.

I don't quite follow your point.

I can see it passed in the CI logs:

(screenshot of CI logs showing the test passing)

joviliast avatar Mar 06 '24 17:03 joviliast