xla

Allow fusing epilogues whose operands are broadcast of effective-scalar instructions.

Open elfiegg opened this issue 1 year ago • 5 comments

Allow fusing epilogues whose operands are broadcasts of effective-scalar instructions. This enables creating fusions for fp8, where the pattern is mul(dot, scalar_ops) and the shapes of the scalar ops are either [] or [1]. This only affects epilogues; the operands of the broadcast still follow the existing fusion rules. Both the Triton and cuDNN backends support this kind of fusion.
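To illustrate the condition described above, here is a minimal Python sketch of the check on a toy instruction graph. It is not the XLA implementation: the `Instr` class and both helper functions are hypothetical names invented for this example, and "effective scalar" is assumed to mean a shape whose dimensions are all 1, which covers the [] and [1] cases mentioned here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Instr:
    """Toy stand-in for an HLO instruction (hypothetical, for illustration)."""
    opcode: str
    shape: Tuple[int, ...]
    operands: Tuple["Instr", ...] = ()


def is_effective_scalar(shape: Tuple[int, ...]) -> bool:
    # Assumption: a shape is an effective scalar when every dimension is 1.
    # all() over an empty tuple is True, so the rank-0 shape [] qualifies.
    return all(dim == 1 for dim in shape)


def is_broadcast_of_effective_scalar(instr: Instr) -> bool:
    # The epilogue operand must be a broadcast whose single input
    # is an effective scalar.
    return (
        instr.opcode == "broadcast"
        and len(instr.operands) == 1
        and is_effective_scalar(instr.operands[0].shape)
    )


# The fp8-style pattern mul(dot, broadcast(scale)) with a [1]-shaped scale.
dot = Instr("dot", (16, 32))
scale = Instr("parameter", (1,))
bcast = Instr("broadcast", (16, 32), (scale,))
epilogue = Instr("multiply", (16, 32), (dot, bcast))

# Operands of the epilogue that would qualify under this sketch.
fusible_operands = [op for op in epilogue.operands
                    if is_broadcast_of_effective_scalar(op)]
```

In this sketch only `bcast` qualifies; the `dot` operand is not a broadcast and would be handled by whatever fusion rules already apply to it.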

cc @sergachev

elfiegg, Aug 07 '24 04:08