xsimd
Add a fixed size batch
As someone who has worked with Vc and std::experimental::simd before, I miss the equivalent of a fixed size SIMD array in xsimd. This would be a version of xsimd::batch where I can specify the length at compile time, and this length is then subdivided and mapped to the available SIMD register sizes. In Vc and std::experimental::simd, this is solved with an additional ABI tag. The equivalent in xsimd would be the architecture type argument A in xsimd::batch<T, A>.
As an example, an xsimd::batch<float, fixed_size<12>> may use 3 SSE registers internally when compiled for SSE, or 1 AVX and 1 SSE register when compiled with AVX. An xsimd::batch<float, fixed_size<5>> would use an SSE register and a scalar, and so on.
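To make the subdivision concrete, here is a minimal sketch of how a fixed length could be split greedily into register-sized pieces at compile time. The names chunk_group, chunk_plan and plan_chunks are hypothetical and not part of xsimd, and the lane counts passed in (4 floats for SSE, 8 floats for AVX) are assumptions about the target.

```cpp
#include <cstddef>

// Hypothetical helper types, not part of xsimd.
struct chunk_group {
    std::size_t lanes = 0; // elements per register in this group (1 means scalar)
    std::size_t count = 0; // how many registers of that width are used
};

struct chunk_plan {
    chunk_group groups[8] = {}; // one entry per register width actually used
    std::size_t size = 0;       // number of valid entries in groups
};

// Splits n elements into groups, starting with the widest register and
// halving the width until everything is covered.
// max_lanes is assumed to be a power of two (e.g. 4 for float on SSE).
constexpr chunk_plan plan_chunks(std::size_t n, std::size_t max_lanes)
{
    chunk_plan plan{};
    for (std::size_t w = max_lanes; w >= 1 && n > 0 && plan.size < 8; w /= 2) {
        if (n >= w) {
            plan.groups[plan.size].lanes = w;
            plan.groups[plan.size].count = n / w;
            ++plan.size;
            n %= w;
        }
    }
    return plan;
}

// 12 floats on SSE: three 4-lane registers.
static_assert(plan_chunks(12, 4).groups[0].count == 3, "3 SSE registers");
// 12 floats on AVX: one 8-lane register followed by one 4-lane register.
static_assert(plan_chunks(12, 8).groups[1].lanes == 4, "1 AVX + 1 SSE");
// 5 floats on SSE: one 4-lane register followed by one scalar.
static_assert(plan_chunks(5, 4).groups[1].lanes == 1, "1 SSE + 1 scalar");
```

Intermediate widths such as 2 floats do not fill a whole SSE register, so a real implementation would have to decide whether to use a partially filled register or scalars in that case.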
Such a construct is very handy when the algorithm dictates a vector length, or when I need to mix element types of different sizes, e.g. a loop over arrays of float and double. In the latter case I could use xsimd::batch<float> and xsimd::batch<double, fixed_size<xsimd::batch<float>::size>> to ensure that both batches have the same number of elements.
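The mixed float/double case could look roughly like the sketch below. It uses a plain array wrapper, fixed_batch, as a stand-in for the requested type; fixed_batch, widen and accumulate are hypothetical names, not xsimd API, and float_lanes = 4 is an assumed value for xsimd::batch<float>::size on SSE. The stand-in does not map to registers itself, it only illustrates how both batches step through the arrays with the same element count.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for batch<T, fixed_size<N>>; not an xsimd type.
template <class T, std::size_t N>
struct fixed_batch {
    std::array<T, N> data{};

    static fixed_batch load(const T* src) {
        fixed_batch b;
        for (std::size_t i = 0; i < N; ++i) b.data[i] = src[i];
        return b;
    }

    void store(T* dst) const {
        for (std::size_t i = 0; i < N; ++i) dst[i] = data[i];
    }
};

// Element-wise addition of two batches with the same type and length.
template <class T, std::size_t N>
fixed_batch<T, N> operator+(const fixed_batch<T, N>& a, const fixed_batch<T, N>& b) {
    fixed_batch<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.data[i] = a.data[i] + b.data[i];
    return r;
}

// Converts the element type while keeping the number of elements.
template <class To, class From, std::size_t N>
fixed_batch<To, N> widen(const fixed_batch<From, N>& in) {
    fixed_batch<To, N> r;
    for (std::size_t i = 0; i < N; ++i) r.data[i] = static_cast<To>(in.data[i]);
    return r;
}

// Assumed lane count of the native float batch (4 on SSE).
constexpr std::size_t float_lanes = 4;

// y[i] += x[i], with float input and double accumulator.
void accumulate(const float* x, double* y, std::size_t n) {
    std::size_t i = 0;
    for (; i + float_lanes <= n; i += float_lanes) {
        auto xf = fixed_batch<float, float_lanes>::load(x + i);
        auto yd = fixed_batch<double, float_lanes>::load(y + i);
        (yd + widen<double>(xf)).store(y + i);
    }
    for (; i < n; ++i) y[i] += x[i]; // scalar tail for leftover elements
}
```

Because both batches hold float_lanes elements, the loop index advances by the same amount for both arrays and no manual splitting of the double side is needed.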
Would it be possible to add such a construct? Thank you!
Some random thoughts: this would slightly overlap with xtensor, and we would probably need a dedicated type to achieve it. We already have support for fixed size arrays, but it is very limited (it only works if the size maps directly to a supported register size).