packed_simd
Investigate using llvm.bswap for vertical byte_swap
Currently we implement vertical byte_swap with shuffles, but llvm.bswap works on vectors as well. We should investigate which method generates better code, use it, and file LLVM bugs for the other (both should generate identical code).
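To make the comparison concrete, here is a minimal scalar sketch of the two strategies. It assumes "vertical" means the bytes are reversed within each lane, which is how llvm.bswap behaves on vector operands. `U32x4`, `byte_swap_shuffle`, and `byte_swap_bswap` are hypothetical names for illustration only, not packed_simd APIs, and plain arrays stand in for the real vector types.

```rust
/// Hypothetical stand-in for a 4-lane vector of u32 (not a packed_simd type).
type U32x4 = [u32; 4];

/// Shuffle-style approach: view the vector as 16 bytes and permute them with
/// a fixed index table that reverses the bytes inside each 4-byte lane.
fn byte_swap_shuffle(v: U32x4) -> U32x4 {
    const IDX: [usize; 16] = [3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12];

    // Flatten the lanes into a byte array in native byte order.
    let mut bytes = [0u8; 16];
    for (i, lane) in v.iter().enumerate() {
        bytes[4 * i..4 * i + 4].copy_from_slice(&lane.to_ne_bytes());
    }

    // Apply the byte permutation.
    let mut shuffled = [0u8; 16];
    for (dst, &src) in shuffled.iter_mut().zip(IDX.iter()) {
        *dst = bytes[src];
    }

    // Reassemble the lanes from the permuted bytes.
    let mut out = [0u32; 4];
    for (i, lane) in out.iter_mut().enumerate() {
        let mut b = [0u8; 4];
        b.copy_from_slice(&shuffled[4 * i..4 * i + 4]);
        *lane = u32::from_ne_bytes(b);
    }
    out
}

/// bswap-style approach: swap the bytes of each lane independently.
/// On scalars, `u32::swap_bytes` is the operation llvm.bswap performs.
fn byte_swap_bswap(v: U32x4) -> U32x4 {
    v.map(u32::swap_bytes)
}
```

Both functions should return the same lanes for any input, for example `[0x0102_0304, 0, 0, 0]` becomes `[0x0403_0201, 0, 0, 0]`. The sketch only checks that the two formulations are semantically equivalent; whether the real shuffle and llvm.bswap lowerings produce the same machine code still has to be checked by comparing the generated assembly per target.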