
Test performance with -mlzcnt

Open • theuni opened this issue 2 years ago • 1 comment

At the moment the __builtin_clz* builtins compile down to bsrq on x86_64. Compiling with -mlzcnt makes the compiler emit the actual lzcnt instruction instead.
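For context, here is a minimal sketch of the kind of helper under discussion, assuming it simply wraps __builtin_clzll. The name CountBitsSketch and its single-argument signature are hypothetical and not taken from the library's source. The listings in this issue may come from a function with a different shape.

```cpp
#include <climits>

static_assert(sizeof(unsigned long long) * CHAR_BIT == 64,
              "this sketch assumes a 64-bit unsigned long long");

// Hypothetical stand-in for the library's bit-counting helper: returns the
// number of bits needed to represent val, or 0 when val is 0.
inline int CountBitsSketch(unsigned long long val) {
    // __builtin_clzll(0) is undefined, so zero is handled explicitly. This
    // guard corresponds to the testq/je pair visible in both listings.
    if (val == 0) return 0;
    // 64 minus the number of leading zero bits. Without -mlzcnt, GCC
    // derives the leading-zero count from bsrq followed by xorq $63.
    return 64 - __builtin_clzll(val);
}
```

Compiling a file containing this function with and without -mlzcnt (for example at -O2 with -S) should make it possible to compare the generated code directly.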

CountBits<unsigned long long> without -mlzcnt:

_Z9CountBitsmi:
.LFB189:
    .cfi_startproc
    endbr64
    xorl    %eax, %eax
    testq   %rdi, %rdi
    je  .L1
    bsrq    %rdi, %rdi
    movl    $64, %eax
    xorq    $63, %rdi
    subl    %edi, %eax

CountBits<unsigned long long> with -mlzcnt:

_Z9CountBitsmi:
.LFB189:
    .cfi_startproc
    endbr64
    xorl    %eax, %eax
    testq   %rdi, %rdi
    je  .L1
    movl    $64, %eax
    lzcntq  %rdi, %rdi
    subl    %edi, %eax

I can't measure how much difference this makes because my CPU does not support the instruction. But I assume @sipa would probably know right away whether it's worth bothering.
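For anyone with a CPU that does support the instruction, a rough micro-benchmark along these lines could give a first impression. This is only a sketch under stated assumptions: CountBitsSketch and BenchCountBits are hypothetical names, the xorshift input generator is an arbitrary choice, and the measured time includes the cost of generating inputs, so only the difference between the two builds is meaningful.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the library's bit-counting helper.
static inline int CountBitsSketch(std::uint64_t val) {
    return val == 0 ? 0 : 64 - __builtin_clzll(val);
}

// Runs `iters` calls (iters must be nonzero) on pseudo-random inputs.
// Stores the average wall-clock time per iteration, in nanoseconds, in
// *ns_per_call and returns a checksum so the compiler cannot drop the loop.
std::uint64_t BenchCountBits(std::uint64_t iters, double* ns_per_call) {
    std::uint64_t x = 0x9E3779B97F4A7C15ULL;  // arbitrary nonzero seed
    std::uint64_t sum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) {
        // xorshift64 step: cheap, and never yields 0 from a nonzero state.
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        sum += static_cast<std::uint64_t>(CountBitsSketch(x));
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    *ns_per_call = elapsed.count() / static_cast<double>(iters);
    return sum;
}
```

The idea would be to build the same file twice, once with -mlzcnt and once without, call BenchCountBits with a large iteration count from a small driver, and compare the reported time per call. A result like this says nothing about the effect on the library as a whole. That would need its own benchmarks.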

theuni, Mar 31 '23 16:03

Related to #80.

fanquake, Dec 12 '23 10:12