wide
Combination of `i8x16::move_mask()` and `.trailing_zeros()` is a footgun
For example, consider this code, extracted from one of my libraries:
```rust
for &group in samples.array_chunks::<16>() {
    let mask = predicate(i8x16::new(group));
    offset += mask.move_mask().trailing_zeros() as usize;
    if mask.any() { break }
}
offset
```
The intent here is to advance `offset` by the number of lanes in the group that precede the first lane matching the predicate. Unfortunately, if the predicate didn't match at all, the moved mask is all zeroes, and because it is 32 bits wide, `trailing_zeros()` returns 32 rather than 16, throwing off the offset.
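To make the arithmetic concrete, here is a minimal sketch in plain Rust that does not depend on `wide`. `move_mask_16` is a hypothetical stand-in that mimics what I understand `i8x16::move_mask()` to do (pack each lane's sign bit into the low bits of a 32-bit integer); it is not the crate's implementation.

```rust
/// Hypothetical stand-in for `i8x16::move_mask()`: packs the sign bit of
/// each of the 16 lanes into the low 16 bits of an `i32`.
fn move_mask_16(lanes: [i8; 16]) -> i32 {
    let mut bits = 0i32;
    for (i, &lane) in lanes.iter().enumerate() {
        if lane < 0 {
            bits |= 1 << i;
        }
    }
    bits
}

/// The naive step from the loop above: how far to advance for one group.
fn naive_advance(lanes: [i8; 16]) -> usize {
    move_mask_16(lanes).trailing_zeros() as usize
}
```

With a match in lane 5, `naive_advance` returns 5 as intended. With no match in any lane, the mask is 0 and `naive_advance` returns 32, so the offset skips 16 elements too many.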
(I know about `iter.remainder()`, but in this particular application it can't be used: by the time the remainder is examined, the iterator has already advanced past the group that matched the predicate.)
Making `i8x16::move_mask()` return a `u16` would solve this problem, but won't work for `i32x4`. Adding `leading_zeros()` and `trailing_zeros()` to the SIMD types, or perhaps boolean mask types from #43, would also solve it.
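As a rough sketch of what a lane-aware `trailing_zeros()` could look like, the count can be capped at the vector's lane count. Everything below is hypothetical and written against plain slices rather than `wide` types: `first_set_lane`, `sign_bits`, and `skip_until_match` are names made up for illustration, the predicate is assumed to be "lane is negative", and `chunks_exact` stands in for `array_chunks`.

```rust
/// Hypothetical lane-aware count: index of the first set lane in a moved
/// mask, or `lanes` if no lane is set.
fn first_set_lane(mask_bits: i32, lanes: u32) -> u32 {
    (mask_bits as u32).trailing_zeros().min(lanes)
}

/// Stand-in for predicate + move_mask on a group of up to 32 lanes:
/// one bit per lane, set when the lane is negative.
fn sign_bits(group: &[i8]) -> i32 {
    let mut bits = 0i32;
    for (i, &lane) in group.iter().enumerate() {
        if lane < 0 {
            bits |= 1 << i;
        }
    }
    bits
}

/// Same loop shape as the original, with the capped count.
fn skip_until_match(samples: &[i8]) -> usize {
    let mut offset = 0;
    for group in samples.chunks_exact(16) {
        let bits = sign_bits(group);
        // Advances by 16 when nothing in this group matched.
        offset += first_set_lane(bits, 16) as usize;
        if bits != 0 {
            break;
        }
    }
    offset
}
```

Because the cap is a parameter, the same helper gives 4 for an empty four-lane mask, which is the case a `u16` return type would not cover.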