wide
A crate to help you go wide. By which I mean use SIMD stuff.
The only hardware move mask instruction is for `i8`, so it's unclear what we should do for the other integer types. They could just not support move mask at all,...
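One possible answer to the question above is to define move mask for wider lanes as "collect the sign bit of each lane". The following is a minimal scalar sketch of that semantic, not the crate's implementation; the function name `move_mask_i32x4` and the plain `[i32; 4]` representation are assumptions made for illustration.

```rust
/// Hypothetical scalar model of a move mask over four `i32` lanes.
/// Bit `i` of the result is the sign bit of lane `i`.
fn move_mask_i32x4(lanes: [i32; 4]) -> u32 {
    let mut mask = 0u32;
    for (i, lane) in lanes.iter().enumerate() {
        // Reinterpret as unsigned and shift right by 31 to isolate the sign bit.
        let sign = (*lane as u32) >> 31;
        // Place that bit at position `i` in the result.
        mask |= sign << i;
    }
    mask
}
```

With this definition, negative lanes set their corresponding bit and non-negative lanes leave it clear, which mirrors what the `i8` instruction does per byte.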
https://kojipkgs.fedoraproject.org//work/tasks/4780/71404780/build.log It is possible that the problem is with Rust or with the fact that it is a big-endian arch, but you would know better than I do. failures: ----...
Seems like a fine addition to the library. I've been spending a lot of my time in here, even though stdsimd/portable-simd will happen someday lol. Do you have any recommendations...
This is the code in question: https://github.com/Lokathor/wide/blob/b2cfe07d7ef059b5f3e4535b3f0fbd428c3ff1c5/src/f64x2_.rs#L290-L292 None of the other impls have this, so there's at least something weird going on here.
There's no Ceil or Floor method implemented for the wide types. Based off of Issue #15, it appears that only Intel has intrinsics for floor/ceil, and other architectures will need...
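Where no hardware intrinsic is available, one fallback is to apply the scalar operation to each lane. The sketch below assumes a plain `[f32; 4]` stands in for a wide type; the names `floor_f32x4` and `ceil_f32x4` are hypothetical and not part of the crate.

```rust
/// Hypothetical lane-wise floor: applies `f32::floor` to every lane.
fn floor_f32x4(lanes: [f32; 4]) -> [f32; 4] {
    lanes.map(f32::floor)
}

/// Hypothetical lane-wise ceil: applies `f32::ceil` to every lane.
fn ceil_f32x4(lanes: [f32; 4]) -> [f32; 4] {
    lanes.map(f32::ceil)
}
```

This gives correct results on any target, at the cost of doing the work one lane at a time where a dedicated instruction is missing.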
It would be great to have documentation, especially Intel-like: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=3333,100,100&text=_mm512_add_epi32 ``` FOR j := 0 to 15 i := j*32 dst[i+31:i] := a[i+31:i] + b[i+31:i] ENDFOR dst[MAX:512] := 0...
An implementation of this would be really useful for Monte Carlo simulations (pricing), and anywhere that 4 or 8 independent random numbers are needed. https://github.com/lemire/simdpcg or https://mathoverflow.net/questions/104915/pseudo-random-algorithm-allowing-o1-computation-of-nth-element I'm not sure...
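To make the request concrete, here is a rough sketch of four independent PCG32 (XSH-RR variant) streams stepped in lockstep. It uses a scalar loop over plain arrays to model the lanes, so it is neither the simdpcg implementation nor anything in this crate; the type name `Pcg32x4` and its methods are hypothetical.

```rust
/// Hypothetical generator: four independent PCG32 streams advanced together.
struct Pcg32x4 {
    state: [u64; 4],
    inc: [u64; 4],
}

impl Pcg32x4 {
    /// Seeds each lane separately; `streams` selects the sequence per lane.
    fn new(seeds: [u64; 4], streams: [u64; 4]) -> Self {
        let mut rng = Pcg32x4 {
            state: [0; 4],
            // The increment must be odd, so shift left and set the low bit.
            inc: streams.map(|s| (s << 1) | 1),
        };
        rng.next_u32x4();
        for i in 0..4 {
            rng.state[i] = rng.state[i].wrapping_add(seeds[i]);
        }
        rng.next_u32x4();
        rng
    }

    /// Returns one 32-bit output per lane and advances every lane's state.
    fn next_u32x4(&mut self) -> [u32; 4] {
        let mut out = [0u32; 4];
        for i in 0..4 {
            let old = self.state[i];
            // Linear congruential step on the 64-bit state.
            self.state[i] = old
                .wrapping_mul(6364136223846793005)
                .wrapping_add(self.inc[i]);
            // Output permutation: xorshift the high bits, then rotate.
            let xorshifted = (((old >> 18) ^ old) >> 27) as u32;
            let rot = (old >> 59) as u32;
            out[i] = xorshifted.rotate_right(rot);
        }
        out
    }
}
```

Because each lane only touches its own `state` and `inc`, the inner loop has no cross-lane dependency, which is the property a real SIMD version would rely on.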