Jeff Daily
@pytorchbot merge
@pytorchbot merge
pytorchbot merge didn't close this automatically. Closing manually.
ciflow/rocm has been added to this PR, so it should provide a ROCm signal now. Thank you @huydhn for the revert and for adding the label.
> Thanks @jeffdaily ! Do you have some benchmark performance number with this PR on FP8 scaled mm?

I do not, I was deferring to any benchmarks you were running.
> Hi @jeffdaily , looks like the speedup on FP8 is minor. Is it because hipBLASLt is not well tuned yet?

That would be my assumption, but let's check with...
@pytorchbot merge
@andrewor14 How long will it take for #119496 to land? If this PR is ready now and the other is not, I'm in favor of landing this one. Would doing...
@andrewor14 or @albanD, since it sounds decided, can we get an approval so we can merge?
@pytorchbot merge