
Lzcnt/popcnt functions?

Open touisteur opened this issue 3 years ago • 4 comments

Hi,

Unless I have looked in the wrong places, there doesn't seem to be any support for lane-wise lzcnt or popcount. Is it planned for the future, or is it out of scope for this library?

Thanks in advance

touisteur avatar Aug 27 '21 02:08 touisteur

Hi @touisteur,

At this time this feature does not exist. Maybe you can use mipp::sum as a workaround until I implement the popcount.

Hope it helps.

kouchy avatar Oct 18 '21 15:10 kouchy

A look at https://github.com/kimwalisch/libpopcnt might be interesting, although I think the two libraries have different goals:

  • libpopcnt is targeted for big data sizes
  • mipp IMHO should target inlined code for simple data types

@kouchy: I don't see how mipp::sum could be a workaround?!

hayguen avatar Oct 22 '21 21:10 hayguen

@hayguen I guess you would use a LUT to convert each byte/nibble to the number of set bits in it, then shuffle/permute and sum until you reach your lane width. Some libraries even do it recursively, and the optimizer unrolls it quite nicely.
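To make the LUT idea concrete, here is a minimal scalar sketch of the two steps (nibble lookup, then summing bytes up to the lane width). It does not use MIPP; the names `kNibbleBits`, `byte_popcount_lut` and `lanewise_popcount_lut` are hypothetical, and a 32-bit lane width is assumed. In a real SIMD version the table lookup would typically be a byte shuffle and the per-lane sum a few shift-and-add steps.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper names for illustration; none of this is MIPP API.

// Number of set bits in each possible 4-bit value (the "LUT").
constexpr std::array<std::uint8_t, 16> kNibbleBits = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};

// Step 1: popcount of one byte via two nibble lookups.
inline std::uint8_t byte_popcount_lut(std::uint8_t b) {
    return static_cast<std::uint8_t>(kNibbleBits[b & 0x0F] + kNibbleBits[b >> 4]);
}

// Step 2: sum the per-byte counts within each 32-bit lane
// (assumed lane width; a scalar stand-in for the SIMD reduction).
template <std::size_t N>
std::array<std::uint32_t, N> lanewise_popcount_lut(
    const std::array<std::uint32_t, N>& lanes) {
    std::array<std::uint32_t, N> counts{};
    for (std::size_t i = 0; i < N; ++i) {
        std::uint32_t total = 0;
        for (int byte = 0; byte < 4; ++byte) {
            total += byte_popcount_lut(
                static_cast<std::uint8_t>(lanes[i] >> (8 * byte)));
        }
        counts[i] = total;
    }
    return counts;
}
```

Each lane is processed independently, so the result for lane `i` depends only on `lanes[i]`, which is the property a lane-wise popcount needs.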

yzazik avatar Oct 24 '21 09:10 yzazik

@yzazik I found my non-SIMD solution at https://graphics.stanford.edu/~seander/bithacks.html, specifically https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel. If I remember right, benchmarks showed that it is very close to the CPU intrinsic on x64.
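For reference, a sketch of that parallel bit count for 32-bit values, applied lane by lane over a plain array. The function names `popcount32_parallel` and `lanewise_popcount_parallel` are hypothetical and not part of MIPP. The technique uses only shifts, masks, adds and one multiply, so the same steps could in principle be applied to every lane of a SIMD register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; not MIPP API.

// Parallel bit count of one 32-bit value (the "CountBitsSetParallel" approach).
inline std::uint32_t popcount32_parallel(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);                  // 2-bit partial counts
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  // 4-bit partial counts
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  // 8-bit partial counts
    return (v * 0x01010101u) >> 24;                    // sum the four bytes
}

// Apply the scalar routine to each 32-bit lane independently.
template <std::size_t N>
std::array<std::uint32_t, N> lanewise_popcount_parallel(
    const std::array<std::uint32_t, N>& lanes) {
    std::array<std::uint32_t, N> counts{};
    for (std::size_t i = 0; i < N; ++i) {
        counts[i] = popcount32_parallel(lanes[i]);
    }
    return counts;
}
```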

hayguen avatar Oct 24 '21 11:10 hayguen