Optimize u8x8::trailing_zeros for AArch64

Open TheIronBorn opened this issue 7 years ago • 0 comments

LLVM's cttz.v8i8 intrinsic is broken on AArch64 machines: https://github.com/rust-lang-nursery/packed_simd/issues/191

Our current workaround just applies u8::trailing_zeros to each lane. With 8 lanes, that can be quite slow.

It could be optimized by adapting LLVM's algorithm to Rust's AArch64 SIMD intrinsics. Some of those intrinsics may be missing, in which case we would have to implement them as well: https://github.com/rust-lang-nursery/stdsimd/issues/40
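
To make the idea concrete, here is a minimal, portable sketch. It assumes the branch-free identity `cttz(x) == ctpop(!x & (x - 1))`, which is one well-known way to compute trailing zeros from a population count; it is not confirmed to be the exact lowering LLVM uses for AArch64. The lanes are modelled as a plain `[u8; 8]` rather than a `u8x8`, and both function names are hypothetical. On AArch64 each step would presumably map to one vector instruction over all 8 lanes (intrinsics along the lines of `vmvn_u8`, `vsub_u8`, `vand_u8` and `vcnt_u8`), but that mapping is an assumption and would need checking against what is actually available.

```rust
/// Per-lane baseline: roughly what the current workaround amounts to.
/// (Hypothetical name; lanes modelled as a plain array.)
fn trailing_zeros_per_lane(v: [u8; 8]) -> [u8; 8] {
    v.map(|x| x.trailing_zeros() as u8)
}

/// Branch-free form using the identity cttz(x) == ctpop(!x & (x - 1)).
/// `x - 1` turns the trailing zeros into ones and clears the lowest set bit;
/// masking with `!x` keeps only those former trailing zeros, so counting
/// the set bits yields the number of trailing zeros. For x == 0 the mask
/// is 0xFF, giving 8, which matches u8::trailing_zeros.
fn trailing_zeros_via_popcount(v: [u8; 8]) -> [u8; 8] {
    v.map(|x| (!x & x.wrapping_sub(1)).count_ones() as u8)
}
```

Written this way, every lane goes through the same four operations with no data-dependent branches, which is what makes the approach a candidate for a vectorized implementation.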

TheIronBorn, Nov 26 '18 02:11