
Optimize NEON shuffles

Open redorav opened this issue 6 years ago • 0 comments

The current NEON shuffles are too generic, which makes them inefficient. We can probably specialize most combinations using constructs such as

vcombine_f32(vget_high_f32(x), vget_low_f32(y))
vrev64q_f32(x)

etc.
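A minimal sketch of what such specialization could look like, not hlslpp's actual implementation. The `shuffle` template, its index convention (0-3 select from `a`, 4-7 from `b`) and the `lane` helper are hypothetical names introduced only for illustration. The generic path goes through memory lane by lane, while two index patterns are specialized with the intrinsics mentioned above. The `#else` branch contains scalar stand-ins that only mimic lane semantics so the sketch also compiles off ARM; they are not real NEON.

```cpp
#include <cassert>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
// Scalar stand-ins (assumption: used only so the sketch builds without NEON).
struct float32x2_t { float v[2]; };
struct float32x4_t { float v[4]; };
inline float32x4_t vld1q_f32(const float* p) { return {{p[0], p[1], p[2], p[3]}}; }
inline void vst1q_f32(float* p, float32x4_t a) { for (int i = 0; i < 4; ++i) p[i] = a.v[i]; }
inline float32x2_t vget_low_f32(float32x4_t a) { return {{a.v[0], a.v[1]}}; }
inline float32x2_t vget_high_f32(float32x4_t a) { return {{a.v[2], a.v[3]}}; }
inline float32x4_t vcombine_f32(float32x2_t lo, float32x2_t hi) {
    return {{lo.v[0], lo.v[1], hi.v[0], hi.v[1]}};
}
inline float32x4_t vrev64q_f32(float32x4_t a) { return {{a.v[1], a.v[0], a.v[3], a.v[2]}}; }
#endif

// Hypothetical generic shuffle: indices 0-3 pick lanes of a, 4-7 pick lanes of b.
// It spills both vectors to memory and gathers one lane at a time.
template <int X, int Y, int Z, int W>
inline float32x4_t shuffle(float32x4_t a, float32x4_t b) {
    static_assert(X >= 0 && X < 8 && Y >= 0 && Y < 8 &&
                  Z >= 0 && Z < 8 && W >= 0 && W < 8, "lane index out of range");
    float src[8];
    float dst[4];
    vst1q_f32(src, a);
    vst1q_f32(src + 4, b);
    dst[0] = src[X];
    dst[1] = src[Y];
    dst[2] = src[Z];
    dst[3] = src[W];
    return vld1q_f32(dst);
}

// Specialization: (a2, a3, b0, b1) is the high half of a followed by the low half of b.
template <>
inline float32x4_t shuffle<2, 3, 4, 5>(float32x4_t a, float32x4_t b) {
    return vcombine_f32(vget_high_f32(a), vget_low_f32(b));
}

// Specialization: (a1, a0, a3, a2) swaps the two lanes inside each 64-bit half of a.
template <>
inline float32x4_t shuffle<1, 0, 3, 2>(float32x4_t a, float32x4_t /*b unused*/) {
    return vrev64q_f32(a);
}

// Hypothetical helper that reads one lane back, for checking results.
inline float lane(float32x4_t v, int i) {
    assert(i >= 0 && i < 4);
    float out[4];
    vst1q_f32(out, v);
    return out[i];
}
```

Call sites stay the same in this sketch: `shuffle<2, 3, 4, 5>(x, y)` resolves to the `vcombine_f32` version at compile time, and any index pattern without a specialization falls back to the generic path.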

redorav, Apr 17 '18 13:04