Halide
Some SVE2 intrinsics are not enabled due to required interleaving.
See 417d7626dc6fbca970647709fa9314b82014e9a1.
It sounds like SVE does things the way Hexagon does. There's a non-trivial amount of code in HexagonOptimize.cpp that inserts the necessary shuffles and then simplifies away as many of them as possible. It could probably be adapted and generalized to serve both SVE and Hexagon.
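To illustrate why inserted shuffles can often be simplified away: an interleave followed by a deinterleave returns the original operands, so an adjacent pair cancels. The following is a minimal sketch in plain C++, not code from HexagonOptimize.cpp. The helper names `interleave` and `deinterleave` are hypothetical and operate on `std::vector<int>` rather than on Halide IR, and the inputs are assumed to have equal length.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helpers for illustration only; these are not Halide APIs.

// Interleave two equal-length vectors: {a0, b0, a1, b1, ...}.
// Assumes a.size() == b.size().
std::vector<int> interleave(const std::vector<int> &a, const std::vector<int> &b) {
    std::vector<int> out;
    out.reserve(a.size() * 2);
    for (std::size_t i = 0; i < a.size(); i++) {
        out.push_back(a[i]);
        out.push_back(b[i]);
    }
    return out;
}

// Split a vector into its even-indexed and odd-indexed lanes.
// Assumes v.size() is even.
std::pair<std::vector<int>, std::vector<int>> deinterleave(const std::vector<int> &v) {
    std::vector<int> even, odd;
    for (std::size_t i = 0; i + 1 < v.size(); i += 2) {
        even.push_back(v[i]);
        odd.push_back(v[i + 1]);
    }
    return {even, odd};
}
```

Calling `deinterleave(interleave(a, b))` yields the pair `(a, b)` again. A pass that wraps an instruction's operands in interleaves and its result in a deinterleave can rely on this identity to remove the shuffles wherever two such instructions are chained.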
> It sounds like SVE does things the way Hexagon does.
That approach is not as widely used in SVE2. I need to quantify which instructions are involved. (The TODO comment is from Steve Suzuki.) I don't think this matters as much for performance as it does for HVX.