
[AMD] ElementwiseOpToLLVM: Do not convert types if they are equal

Open joviliast opened this issue 1 year ago • 5 comments

This commit fixes a failure in python/tutorials/03-matrix-multiplication.py for the FMA cases.

joviliast avatar Feb 07 '24 14:02 joviliast

Essentially, this patch enables the triton::FPToFP operation to cast fp16 to fp32 and back. @joviliast, is that correct?

@ptillet Do we want the FPToFP operation to be able to convert between arbitrary float types?

binarman avatar Feb 07 '24 15:02 binarman

Essentially, this patch enables the triton::FPToFP operation to cast fp16 to fp32 and back. @joviliast, is that correct?

This patch disables casting. Did you mean "pass"?

joviliast avatar Feb 07 '24 15:02 joviliast

This patch disables casting. Did you mean "pass"?

It disables only the "intermediate" casting; the rest stays in place:

  • https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1686
  • https://github.com/openai/triton/pull/3091/files#diff-c1b7645c56652fe811ae807f37bbea185af4221b1122fa095fbd43b80266182eR1700

binarman avatar Feb 07 '24 15:02 binarman

@binarman in principle yes, in practice there are many subtleties related to FP8 and rounding modes. Even our Nvidia backend isn't handling this super well today

ptillet avatar Feb 08 '24 19:02 ptillet

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

zhanglx13 avatar Feb 13 '24 14:02 zhanglx13

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

Because conversion between identical internal types is simply not supported: it fails at compile time as an unsupported conversion.

joviliast avatar Feb 22 '24 14:02 joviliast
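To illustrate the shape of the fix, here is a minimal standalone sketch of the guard being described. The function name, string-based types, and emitted "cast" text are all illustrative; the real ElementwiseOpToLLVM code operates on MLIR `Type` values and LLVM dialect ops:

```cpp
#include <cassert>
#include <string>

// Illustrative sketch only (not the actual Triton code): models skipping
// the intermediate conversion when source and destination element types
// are already equal, which previously hit the "unsupported conversion"
// compile-time failure.
std::string convertElement(const std::string &srcTy, const std::string &dstTy,
                           const std::string &value) {
  // The fix: if the types match, return the value unchanged instead of
  // emitting a same-type conversion.
  if (srcTy == dstTy)
    return value;
  // Otherwise emit a (hypothetical) cast.
  return "cast<" + srcTy + "->" + dstTy + ">(" + value + ")";
}
```

With this guard, an f32-to-f32 "conversion" becomes a no-op, while genuine conversions such as f16-to-f32 are still emitted.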

@joviliast @binarman Can you elaborate on the motivation of this PR? Why does converting between the same type cause FMA failures?

Because conversion between identical internal types is simply not supported: it fails at compile time as an unsupported conversion.

Can you make a minimized lit test out of this failure? That will help everybody understand (and will prevent regressions).

ThomasRaoux avatar Feb 22 '24 16:02 ThomasRaoux

Can you make a minimized lit test out of this failure? That will help everybody understand (and will prevent regressions).

@ThomasRaoux, done. I added a commit containing a lit test for the fixed case. It can be run after merging https://github.com/openai/triton/pull/3254

joviliast avatar Mar 05 '24 11:03 joviliast

The lit tests under the amd dir are not being run.

Thanks for catching that. Is someone working on fixing it?

ThomasRaoux avatar Mar 06 '24 17:03 ThomasRaoux

The lit tests under the amd dir are not being run.

I don't quite follow your point.

I can see it passed in the CI logs:

(screenshot of CI logs showing the test passing)

joviliast avatar Mar 06 '24 17:03 joviliast