rgbds icon indicating copy to clipboard operation
rgbds copied to clipboard

[Feature request] Count leading/trailing zeros and population count

Open aaaaaa123456789 opened this issue 5 years ago • 10 comments

Having built-in functions to count the number of leading and trailing zeros in a number, as well as the number of set bits in it (population count), would be very useful, as this is something that is necessary every now and then and there's no way to do it without a loop.

Example:

  db CLZ(42), CTZ(42), POPCNT(42)

...would output 26, 1, 3. (The leading zero count is based on 32-bit numbers. This is easy to adjust to narrower widths by subtraction; the function doesn't need to worry about guessing the correct width.)

aaaaaa123456789 avatar Jun 21 '20 10:06 aaaaaa123456789

What is it useful for?

ISSOtm avatar Jun 21 '20 10:06 ISSOtm

To give an example I just came across, this ugly macro here: https://github.com/pret/pokecrystal/blob/b50dd57cbb8e163cd269255c8f754e5631760a03/macros/code.asm#L26-L48

That could be trivially rewritten as:

maskbits: MACRO
  if _NARG > 1
sh = \2
  else
sh = 0
  endc
  and (1 << (32 - CLZ((\1) - 1)) - 1) << sh
ENDM

aaaaaa123456789 avatar Jun 21 '20 10:06 aaaaaa123456789

I don't think this is appropriate as a built-in function, then.

ISSOtm avatar Jun 21 '20 11:06 ISSOtm

That case is a macro, but this would often be useful inline. And the truth is that there's no way to do this inline without invoking some sort of looping macro. The little example I posted in the chat is a great case of an inline use:

  if FOO & (FOO - 1)
    ; FOO is not a power of two
    ld a, c
    cp FOO
  else
    bit CTZ(FOO), c
  endc

There's simply no way of achieving that behavior right now, short of hiding the entire comparison code in a looping macro. And that's bad for several reasons.

aaaaaa123456789 avatar Jun 21 '20 12:06 aaaaaa123456789

With user-defined functions (#201) and a ternary operator (#621) (and STRFMT (#178) for terse CLZ and CTZ), these could be implemented as such (with adjustable width for CLZ and CTZ instead of assuming 32-bit):

DEF POPCNT(n) := n < 2 ? n : n % 2 + POPCNT(n / 2)
DEF CLZ(n) := n == 0 ? 32 : 32 - STRLEN(STRFMT("%b", n))
DEF CTZ(n) := n == 0 ? 32 : STRLEN(STRFMT("%b", n)) - STRRIN(STRFMT("%b", n), "1")
; closed-form expression from https://en.wikipedia.org/wiki/Bit_manipulation_instruction_set
DEF CTZ(n) := 31 + (!x) \
  - (((x & -x) & $0000FFFF) ? 16 : 0) \
  - (((x & -x) & $00FF00FF) ? 8 : 0) \
  - (((x & -x) & $0F0F0F0F) ? 4 : 0) \
  - (((x & -x) & $33333333) ? 2 : 0) \
  - (((x & -x) & $55555555) ? 1 : 0)

Rangi42 avatar Dec 14 '20 00:12 Rangi42

Also useful for counting the bits needed to store a number:

DEF CEIL_LOG2(n) := n == 0 ? 0 : STRLEN(STRFMT("%b", n))
DEF CLZ(n) := 32 - CEIL_LOG2(n)

Rangi42 avatar Dec 14 '20 00:12 Rangi42

Converting a number to string just to count its digits is one of the oldest bad programming memes around... That being said, I wonder how much it'd slow down a build that relied on these functions heavily.

aaaaaa123456789 avatar Dec 14 '20 00:12 aaaaaa123456789

I'm not sure you'd need string conversion, either, but that might not be faster. Then again, we will see about that once #201 is implemented.

ISSOtm avatar Dec 14 '20 11:12 ISSOtm

You could also do:

def POPCNT(x) = x ? 1 + POPCNT(x & (x - 1)) : 0
def  TZCNT(x) = x ? x % 2 ? 0 : 1 + TZCNT(x / 2) : 0
def  LZCNT(x) = x ? (x & 1<<31) ? 0 : 1 + LZCNT(x << 1) : 0

Rangi42 avatar Dec 14 '20 14:12 Rangi42

Macro Assembler AS has these as BITCNT, FIRSTBIT, and LASTBIT.

Rangi42 avatar Feb 01 '21 19:02 Rangi42

Assuming user-defined functions will support recursion, and that we implement a short-circuiting ternary operator, these can be defined as follows:

DEF popcnt(x) := x ? (1 + popcnt(x & (x - 1))) : 0
DEF ctz(x) := x ? ((x &        1 ) ? 0 : (1 + ctz(x >> 1))) : 0
DEF clz(x) := x ? ((x & (1 << 31)) ? 0 : (1 + clz(x << 1))) : 0

Rangi42 avatar Sep 28 '22 18:09 Rangi42