ParallelStencil.jl

Include `@tturbo` as loop vectorisation possibility for the CPU backend

Open · luraess opened this issue 4 years ago · 1 comment

Something to consider as an alternative or supplement to the current `Threads.@threads` option. The `@tturbo` macro, exposed by the LoopVectorization package, combines multithreading with SIMD (AVX) vectorisation of loops. See https://github.com/luraess/parallel-gpu-workshop-JuliaCon21#parallel-cpu-implementation for an example. There may be some restrictions on handling `if` conditions inside the loop.
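For illustration, here is a minimal sketch of the two options side by side on a 2D diffusion update. The function names, the array names and the stencil itself are hypothetical and not taken from ParallelStencil; only `Threads.@threads` and `@tturbo` are the real macros under discussion.

```julia
using LoopVectorization  # external package providing @turbo / @tturbo

# Hypothetical diffusion step using the current approach: threads over the outer loop.
function diffusion_threads!(T2, T, D, dt, dx, dy)
    nx, ny = size(T)
    cx = dt * D / dx^2   # precomputed coefficients keep the loop body simple
    cy = dt * D / dy^2
    Threads.@threads for iy in 2:ny-1
        for ix in 2:nx-1
            @inbounds T2[ix, iy] = T[ix, iy] +
                cx * (T[ix+1, iy] - 2 * T[ix, iy] + T[ix-1, iy]) +
                cy * (T[ix, iy+1] - 2 * T[ix, iy] + T[ix, iy-1])
        end
    end
    return T2
end

# Same update with @tturbo: threaded and SIMD-vectorised by LoopVectorization.
function diffusion_tturbo!(T2, T, D, dt, dx, dy)
    nx, ny = size(T)
    cx = dt * D / dx^2
    cy = dt * D / dy^2
    @tturbo for iy in 2:ny-1
        for ix in 2:nx-1
            T2[ix, iy] = T[ix, iy] +
                cx * (T[ix+1, iy] - 2 * T[ix, iy] + T[ix-1, iy]) +
                cy * (T[ix, iy+1] - 2 * T[ix, iy] + T[ix, iy-1])
        end
    end
    return T2
end
```

Both functions should give the same result up to floating-point rounding, so the choice could in principle be a backend switch. Regarding the `if` restriction, one possible workaround (an assumption, not verified against ParallelStencil's code generation) is to express simple conditions with the branch-free `ifelse(cond, a, b)`.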

luraess · Jul 08 '21 21:07

Reopened: the planned GPU optimizations should also make using LoopVectorization feasible with little or no divergence between the CPU and GPU code-generation approaches.

omlins · Jul 29 '21 11:07