SIMD optimization of iteration

Open Aceeri opened this issue 7 years ago • 4 comments

It would be interesting to look at SIMD for optimizing BitIter. I'm unsure whether it would be worth it with extensions like AVX512, since they tend to lower the CPU clock frequency, but SSE2 is probably worth exploring and benchmarking.

Some resources for that: https://doc.rust-lang.org/stable/std/macro.is_x86_feature_detected.html https://github.com/AdamNiederer/faster

Aceeri avatar Sep 28 '18 06:09 Aceeri

You mean that we would provide a simd_iter version of BitIter, which would construct vectors of the indices that are set, so that SIMD operations can be done on them?

WaDelma avatar Sep 28 '18 08:09 WaDelma

Yes, or even modifying the BitIter implementation (falling back to the current one if SSE2 is not available on the system).

I think modifying the current implementation would be better. Of course, we should only merge this if it actually yields performance gains, especially in larger examples, since some SIMD instructions (AVX512) tend to lower the CPU clock frequency, which is bad for performance.

Aceeri avatar Sep 28 '18 09:09 Aceeri
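
The fallback idea above could be sketched roughly as follows. This is not hibitset code: it assumes a flat slice of 64-bit words rather than hibitset's layered structure, and the function names (first_nonzero, first_nonzero_sse2, first_nonzero_scalar) are hypothetical. It only shows how is_x86_feature_detected! can select an SSE2 path at runtime and fall back to a scalar loop otherwise. Whether the SSE2 path is actually faster would still need benchmarking.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar fallback: index of the first non-zero word, if any.
fn first_nonzero_scalar(words: &[u64]) -> Option<usize> {
    words.iter().position(|&w| w != 0)
}

/// SSE2 path (hypothetical helper): tests two 64-bit words per step.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn first_nonzero_sse2(words: &[u64]) -> Option<usize> {
    let mut i = 0;
    while i + 2 <= words.len() {
        let all_zero = unsafe {
            // Unaligned 128-bit load of words[i] and words[i + 1].
            let v = _mm_loadu_si128(words.as_ptr().add(i) as *const __m128i);
            // Compare every byte with zero; the mask has one bit per byte.
            let eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());
            _mm_movemask_epi8(eq) == 0xFFFF
        };
        if !all_zero {
            // One of the two words is non-zero; report the first one.
            return if words[i] != 0 { Some(i) } else { Some(i + 1) };
        }
        i += 2;
    }
    // Handle a possible trailing word with the scalar code.
    first_nonzero_scalar(&words[i..]).map(|p| p + i)
}

/// Picks the SSE2 path when the CPU supports it, else the scalar one.
pub fn first_nonzero(words: &[u64]) -> Option<usize> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            // Safe to call: the required feature was just detected.
            return unsafe { first_nonzero_sse2(words) };
        }
    }
    first_nonzero_scalar(words)
}
```

Callers only see first_nonzero, so the public API stays the same regardless of which path runs.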

But iterator and simd_iterator have completely different APIs, so I don't see how modifying BitIter would be an option.

WaDelma avatar Sep 28 '18 10:09 WaDelma

Oh, I didn't mean implementing simd_iterator. I just meant having some more state in BitIter that allows processing multiple indices in one iteration (which we then use for the next iteration), or something similar.

Aceeri avatar Sep 28 '18 20:09 Aceeri
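
One way to read the "more state" suggestion is an iterator that decodes several indices at once into a small buffer and then hands them out one by one, keeping the ordinary Iterator API. The sketch below is an assumption about what that could look like, not the actual BitIter: BatchedBitIter is a hypothetical type, it walks a flat slice of words instead of hibitset's layers, and the decode step is plain scalar code standing in for where a SIMD routine could go.

```rust
/// Hypothetical batched iterator over a flat slice of 64-bit words.
pub struct BatchedBitIter<'a> {
    words: &'a [u64],
    next_word: usize, // next word to decode
    buf: [u32; 64],   // indices decoded from the last non-empty word
    len: usize,       // number of valid entries in `buf`
    pos: usize,       // next entry of `buf` to yield
}

impl<'a> BatchedBitIter<'a> {
    pub fn new(words: &'a [u64]) -> Self {
        BatchedBitIter { words, next_word: 0, buf: [0; 64], len: 0, pos: 0 }
    }

    /// Decodes every set bit of the next non-empty word into `buf`.
    /// Returns false when no words are left.
    fn refill(&mut self) -> bool {
        while self.next_word < self.words.len() {
            let base = (self.next_word as u32) * 64;
            let mut w = self.words[self.next_word];
            self.next_word += 1;
            if w == 0 {
                continue; // skip empty words
            }
            self.len = 0;
            self.pos = 0;
            while w != 0 {
                self.buf[self.len] = base + w.trailing_zeros();
                self.len += 1;
                w &= w - 1; // clear the lowest set bit
            }
            return true;
        }
        false
    }
}

impl<'a> Iterator for BatchedBitIter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Only decode a new batch once the buffered indices are used up.
        if self.pos == self.len && !self.refill() {
            return None;
        }
        let idx = self.buf[self.pos];
        self.pos += 1;
        Some(idx)
    }
}
```

Because the batching is hidden behind next, existing callers would not need to change. Only refill would have to be swapped for a vectorized version.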