MIPP
Lzcnt/popcnt functions?
Hi,
Unless I haven't looked in the right places, there doesn't seem to be any support for lane-wise lzcnt or popcount. Is this planned for the future, or is it out of scope for this library?
Thanks in advance
Hi @touisteur,
At this time this feature does not exist. Maybe you can use mipp::sum as a workaround until I implement the popcount.
Hope it helps.
A look at https://github.com/kimwalisch/libpopcnt might be interesting, although I think the goals are different:
- libpopcnt is targeted for big data sizes
- mipp IMHO should target inlined code for simple data types
@kouchy: I don't see how mipp::sum could be a workaround?!
@hayguen I guess you would use a LUT to convert each byte/nibble to the number of set bits in it, then shuffle/permute and sum until you reach your lane width. Some libraries even do it recursively, and the optimizer unrolls it quite nicely.
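To make the LUT idea concrete, here is a minimal portable sketch in plain C++. It is not MIPP code: the names kNibblePopcnt, popcnt_byte and popcnt_lanes are made up for illustration, and the loop only emulates what a SIMD version would do with a byte shuffle (for example _mm_shuffle_epi8 on x86) followed by horizontal adds within each lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 16-entry lookup table: number of set bits in each 4-bit value.
constexpr std::array<std::uint8_t, 16> kNibblePopcnt = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};

// Step 1: per-byte count from two table lookups (low and high nibble).
inline std::uint8_t popcnt_byte(std::uint8_t b) {
    return static_cast<std::uint8_t>(kNibblePopcnt[b & 0x0F] +
                                     kNibblePopcnt[b >> 4]);
}

// Step 2: sum the byte counts that belong to the same lane.
// lane_bytes is the lane width in bytes (e.g. 4 for 32-bit lanes);
// this parameter is an assumption of the sketch, not a MIPP concept.
inline std::vector<std::uint32_t> popcnt_lanes(
    const std::vector<std::uint8_t>& bytes, std::size_t lane_bytes) {
    assert(lane_bytes > 0 && bytes.size() % lane_bytes == 0);
    std::vector<std::uint32_t> counts(bytes.size() / lane_bytes, 0);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        counts[i / lane_bytes] += popcnt_byte(bytes[i]);
    }
    return counts;
}
```

In a real SIMD implementation the table would sit in one register and both lookups would handle all bytes of the vector at once; the summing step is where the lane width comes in.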
@yzazik I found my non-SIMD solution at https://graphics.stanford.edu/~seander/bithacks.html, specifically https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel. If I remember correctly, benchmarks showed that it is very close to the CPU intrinsic on x64.
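For reference, the 32-bit variant of that bit hack looks roughly like this. The function name popcount32_swar is just for illustration, and the sketch assumes C++14 or later for the constexpr body. Since it uses only shifts, masks, adds and one multiply, the same steps should map onto lane-wise integer operations where those are available.

```cpp
#include <cstdint>

// Parallel (SWAR) bit count for a 32-bit value, following the
// "Counting bits set, in parallel" entry of the bithacks page.
constexpr std::uint32_t popcount32_swar(std::uint32_t v) {
    // Each 2-bit field now holds the count of its two bits.
    v = v - ((v >> 1) & 0x55555555u);
    // Each 4-bit field now holds the count of its four bits.
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    // Each byte now holds the count of its eight bits.
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;
    // The multiply sums the four bytes into the top byte.
    return (v * 0x01010101u) >> 24;
}
```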