GPUifyLoops.jl

Fuse multiply-add

Open • vchuravy opened this issue 6 years ago • 2 comments

LLVM needs to know that the fadd is fast so that the MulAdd pass can do its thing. How do we use fma without making the code ugly?

vchuravy avatar Mar 12 '19 16:03 vchuravy

Add MuladdMacro.jl to their code?
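
For context, a minimal sketch of what that suggestion amounts to, using only Base functions. MuladdMacro.jl's @muladd rewrites expressions of the form a*x + y into muladd(a, x, y) calls, so the source keeps its infix look. The function names below (axpy_plain, axpy_muladd, axpy_fma) are hypothetical and exist only for illustration. Whether the plain form gets fused depends on the flags LLVM sees, so treat the comments as a rough guide rather than a guarantee.

```julia
# Plain form: a separate multiply and add. Without fast-math style flags,
# LLVM is generally not allowed to fuse these into one instruction.
axpy_plain(a, x, y) = a * x + y

# Base.muladd permits (but does not force) fusion into a single
# multiply-add. This is the call that @muladd would generate from the
# plain expression above.
axpy_muladd(a, x, y) = muladd(a, x, y)

# Base.fma always computes a*x + y with a single rounding, falling back
# to a slower software path where the hardware has no FMA instruction.
axpy_fma(a, x, y) = fma(a, x, y)

# With exactly representable inputs all three variants agree.
a, x, y = 2.0f0, 3.0f0, 1.0f0
@assert axpy_plain(a, x, y) == 7.0f0
@assert axpy_muladd(a, x, y) == 7.0f0
@assert axpy_fma(a, x, y) == 7.0f0
```

With the package loaded, the same effect would come from writing @muladd a * x + y, which keeps kernels readable while still handing LLVM an explicit multiply-add.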

ChrisRackauckas avatar May 07 '19 18:05 ChrisRackauckas

Might be fixed by https://github.com/vchuravy/GPUifyLoops.jl/pull/55

vchuravy avatar May 07 '19 19:05 vchuravy