wide
u16x16
This one fits into 256bits, so it can be implemented via AVX.
256-bit support has taken a back seat compared to 128-bit types simply for lack of time to do everything at once, but this would be welcomed as a PR.
@RazrFalcon I might be able to do this, but where is something like this useful? I'm genuinely curious.
@ronniec95 I'm using it in my Skia port: https://github.com/RazrFalcon/tiny-skia/tree/master/src/wide
Right now, I have only a scalar implementation. And I would like to replace the custom SIMD implementation with an existing one, like this crate.
I had a quick look at this but was puzzled that the unsigned types seem to use signed add instructions: add_i16_m128i. I can see there are saturating unsigned adds we could use: _mm_adds_epu16. Is it that add_i16_m128i works for both? If so, why do they have signed/unsigned in their naming?
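For wrapping addition, which is what the integer types in wide use, signed and unsigned use the same bit manipulation, and thus the same hardware instruction. Saturating addition is different: the clamp limits depend on signedness, so it needs separate signed and unsigned instructions.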
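To make the point above concrete, here is a minimal scalar sketch using only the Rust standard library. It is not code from wide; the function names are hypothetical and chosen for illustration. It adds two 16-bit values once as unsigned and once as signed (reinterpreting the same bits), and does the same for saturating addition.

```rust
/// Wrapping add with the bits interpreted as unsigned.
fn wrapping_as_unsigned(a: u16, b: u16) -> u16 {
    a.wrapping_add(b)
}

/// Wrapping add with the same bits reinterpreted as signed,
/// then cast back to unsigned so the raw bits can be compared.
fn wrapping_as_signed(a: u16, b: u16) -> u16 {
    (a as i16).wrapping_add(b as i16) as u16
}

/// Checks a handful of edge-case inputs: both interpretations
/// should produce identical bit patterns for wrapping addition.
fn wrapping_adds_agree() -> bool {
    let samples = [0u16, 1, 0x7FFF, 0x8000, 0xFFFF, 1234, 40000];
    samples.iter().all(|&a| {
        samples
            .iter()
            .all(|&b| wrapping_as_unsigned(a, b) == wrapping_as_signed(a, b))
    })
}

/// Saturating add, unsigned: clamps at 0xFFFF.
fn saturating_as_unsigned(a: u16, b: u16) -> u16 {
    a.saturating_add(b)
}

/// Saturating add, signed: clamps at i16::MIN / i16::MAX,
/// so the resulting bits can differ from the unsigned version.
fn saturating_as_signed(a: u16, b: u16) -> u16 {
    (a as i16).saturating_add(b as i16) as u16
}
```

With these helpers, `wrapping_adds_agree()` should hold for every pair of inputs, while `saturating_as_unsigned(0xFFFF, 1)` and `saturating_as_signed(0xFFFF, 1)` give different bits (the signed version sees -1 + 1). That is why a single wrapping add instruction can serve both u16 and i16 lanes, but the saturating forms come in signed and unsigned variants.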
Ah this should not have been closed as this one is not yet implemented.
oops!