hibitset SIMD optimization of iteration

It would be interesting to look at SIMD for optimizing BitIter. I'm unsure whether it would be worth it with extensions like AVX512, since they generally lower the CPU clock frequency, but SSE2 is probably worth exploring and benchmarking.

Some resources for that: https://doc.rust-lang.org/stable/std/macro.is_x86_feature_detected.html https://github.com/AdamNiederer/faster

Sep 28 '18 06:09 Aceeri

You mean that we would provide a simd_iter version of BitIter, which would construct vectors of the indices that are set, so that SIMD operations can be done on them?

Sep 28 '18 08:09 WaDelma

Yes, or even modifying the BitIter implementation (falling back to the current one if SSE2 isn't available on the system).

I think modifying the current implementation would be better. Of course, we should only actually merge this if it gives real performance gains, especially in larger examples, since some SIMD instruction sets (AVX512) tend to lower the CPU clock frequency, which is not good for performance.
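A minimal sketch of the "use SSE2 if present, otherwise fall back" idea, using `is_x86_feature_detected!` from the standard library. The function names (`first_nonzero`, `first_nonzero_sse2`, `first_nonzero_scalar`) and the flat `&[u64]` layout are assumptions made for illustration and are not hibitset's actual API. It only shows skipping empty words two at a time, and whether that is faster than the scalar loop would still need benchmarking.

```rust
/// Scalar fallback: index of the first word that has any bit set.
fn first_nonzero_scalar(words: &[u64]) -> Option<usize> {
    words.iter().position(|&w| w != 0)
}

/// SSE2 variant (hypothetical helper): tests two u64 words per 128-bit load.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse2")]
unsafe fn first_nonzero_sse2(words: &[u64]) -> Option<usize> {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let mut i = 0;
    while i + 2 <= words.len() {
        // The mask has all 16 low bits set only when all 16 bytes are zero.
        let mask = unsafe {
            // Unaligned load is fine here: words[i] and words[i + 1] are in bounds.
            let v = _mm_loadu_si128(words.as_ptr().add(i) as *const __m128i);
            _mm_movemask_epi8(_mm_cmpeq_epi8(v, _mm_setzero_si128()))
        };
        if mask != 0xFFFF {
            return if words[i] != 0 { Some(i) } else { Some(i + 1) };
        }
        i += 2;
    }
    // Handle a possible odd trailing word with the scalar path.
    words[i..].iter().position(|&w| w != 0).map(|p| p + i)
}

/// Dispatches at runtime: SSE2 when the CPU reports it, scalar otherwise.
fn first_nonzero(words: &[u64]) -> Option<usize> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse2") {
            // Safe to call: the feature check above just succeeded.
            return unsafe { first_nonzero_sse2(words) };
        }
    }
    first_nonzero_scalar(words)
}
```

Calling `first_nonzero(&[0, 0, 0, 5, 0])` should give the same answer as the scalar version, which makes it easy to compare the two paths in a test or a benchmark.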

Sep 28 '18 09:09 Aceeri

But iterator and simd_iterator have completely different APIs, so I don't see how modifying BitIter would be an option.

Sep 28 '18 10:09 WaDelma

Oh, I didn't mean implementing simd_iterator. I just meant having some more state in BitIter that allows processing multiple indices in one iteration (which we then use for the next iteration), or something similar.
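One possible reading of "more state in the iterator", as a rough sketch: keep the normal `Iterator` API, but decode all set bits of a word into a small buffer at once and serve the following `next()` calls from that buffer. `BatchedBitIter` is a hypothetical flat iterator over `&[u64]`, not hibitset's real (layered) `BitIter`, and the decode step here is plain scalar code. It only marks the place where a SIMD routine could be swapped in.

```rust
/// Hypothetical flat bit iterator with a per-word index buffer.
struct BatchedBitIter<'a> {
    words: &'a [u64],
    next_word: usize, // index of the next word to decode
    buf: [u32; 64],   // indices decoded from the current word
    len: usize,       // number of valid entries in `buf`
    pos: usize,       // next entry of `buf` to yield
}

impl<'a> BatchedBitIter<'a> {
    fn new(words: &'a [u64]) -> Self {
        BatchedBitIter { words, next_word: 0, buf: [0; 64], len: 0, pos: 0 }
    }

    /// Decodes every set bit of the next non-empty word into `buf`.
    /// Returns false once all words are exhausted.
    fn refill(&mut self) -> bool {
        while self.next_word < self.words.len() {
            let mut w = self.words[self.next_word];
            let base = (self.next_word as u32) * 64;
            self.next_word += 1;
            if w == 0 {
                continue; // skip empty words
            }
            self.len = 0;
            self.pos = 0;
            // Batch step: this loop is what a SIMD decode would replace.
            while w != 0 {
                self.buf[self.len] = base + w.trailing_zeros();
                self.len += 1;
                w &= w - 1; // clear the lowest set bit
            }
            return true;
        }
        false
    }
}

impl<'a> Iterator for BatchedBitIter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Only touch the words again when the buffer has been drained.
        if self.pos == self.len && !self.refill() {
            return None;
        }
        let idx = self.buf[self.pos];
        self.pos += 1;
        Some(idx)
    }
}
```

Since the public interface is still a plain iterator, something like `BatchedBitIter::new(&words).collect::<Vec<u32>>()` works unchanged for callers, and the buffered and unbuffered versions can be benchmarked against each other.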

Sep 28 '18 20:09 Aceeri