xsimd
Use _mm256_cvttps_epi32 instead of _mm256_cvtps_epi32 everywhere
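cvttps truncates toward zero, so converting 3/5 (0.6) to an integer gives 0, i.e. 3/5 == 0 becomes true, which seems to be the default float-to-int conversion behavior in C++. cvtps instead rounds according to the current rounding mode (round-to-nearest-even by default), so the same input would give 1.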
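
A minimal scalar sketch of the difference, assuming the default rounding mode (round-to-nearest-even). This is not xsimd code: the function names are hypothetical, and static_cast and std::lrintf only stand in for what cvttps and cvtps do per lane.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: models one lane of cvttps (truncate toward zero).
// This is also what a plain C++ float-to-int conversion does.
inline int convert_truncate(float x) {
    return static_cast<int>(x);
}

// Hypothetical helper: models one lane of cvtps, which rounds using the
// current rounding mode (round-to-nearest-even unless it has been changed).
inline int convert_round(float x) {
    return static_cast<int>(std::lrintf(x));
}

// Small demonstration of where the two conversions disagree.
inline void demo() {
    const float q = 3.0f / 5.0f;        // roughly 0.6
    assert(convert_truncate(q) == 0);   // matches static_cast<int>(0.6f)
    assert(convert_round(q) == 1);      // rounded to nearest
}
```

With the truncating variant, the vectorized conversion agrees with what static_cast<int> produces for each element, which is the consistency the change is after.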