
constexpr potential improvements

Open pps83 opened this issue 10 months ago • 0 comments

I came across libdivide and was curious how it compares with what the compiler generates for constant divisors. I made a few fixes and then added constexpr support. With constexpr, the compiler generates identical code whether the compile-time constant is used directly or through libdivide.

I used this test code:

#include <cstdint>
#include "libdivide.h"

const unsigned freq = 3417614531;
const auto div_u = libdivide::libdivide_u64_gen(freq);
const auto div_ub = libdivide::libdivide_u64_branchfree_gen(freq);
const auto div_ux = libdivide::divider<uint64_t>(freq);
const auto div_ubx = libdivide::branchfree_divider<uint64_t>(freq);

unsigned toMicro(uint64_t ticks)
{
    return static_cast<unsigned>(ticks * 1000000 / freq);
}

unsigned toMicro_u(uint64_t ticks)
{
    return static_cast<unsigned>(libdivide::libdivide_u64_do(ticks * 1000000, &div_u));
}
// ... and other variants for div_ub, div_ux, div_ubx

All of the toMicro_xxx variants generated different code. However, once I added constexpr, all of them produce identical results. Here is the test code with current master, and the same test code with the constexpr changes.

To minimize changes, I hijacked LIBDIVIDE_INLINE, redefining it as constexpr LIBDIVIDE_INLINE, to make it work.
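A minimal sketch of the idea, not the actual patch: SKETCH_INLINE is a hypothetical stand-in for the redefined LIBDIVIDE_INLINE, and sketch_u32_t, sketch_u32_gen and sketch_u32_do are made-up names for a simplified 32-bit divider (the test code above uses 64-bit dividends). It assumes the divisor is at least 2. Once both the generator and the divide step are constexpr, the compiler can evaluate the magic number at compile time, which is what lets it fold everything the same way it does for a literal divisor.

```cpp
#include <cstdint>

// Hypothetical stand-in for LIBDIVIDE_INLINE after it is redefined to carry constexpr.
#define SKETCH_INLINE constexpr inline

// Precomputed divider for 32-bit dividends: magic = floor((2^64 - 1) / d) + 1.
struct sketch_u32_t {
    uint64_t magic;
};

// Valid for d >= 2; d == 1 would wrap the magic number to 0.
static SKETCH_INLINE sketch_u32_t sketch_u32_gen(uint32_t d) {
    return sketch_u32_t{UINT64_MAX / d + 1};
}

// Returns floor(n / d) as the high 64 bits of magic * n.
// The product is formed from two 32-bit halves of magic, so no 128-bit type is needed.
static SKETCH_INLINE uint32_t sketch_u32_do(uint32_t n, const sketch_u32_t& div) {
    return static_cast<uint32_t>(
        ((div.magic >> 32) * n + (((div.magic & 0xFFFFFFFFu) * n) >> 32)) >> 32);
}

// Divisor taken from the test code; the checks run entirely at compile time.
constexpr uint32_t kFreq = 3417614531u;
constexpr sketch_u32_t kDiv = sketch_u32_gen(kFreq);

static_assert(sketch_u32_do(4000000000u, kDiv) == 4000000000u / kFreq,
              "constexpr divider must match plain division");
static_assert(sketch_u32_do(kFreq - 1, kDiv) == 0u, "n just below d gives 0");
static_assert(sketch_u32_do(kFreq, kDiv) == 1u, "n equal to d gives 1");
```

Because kDiv is a constant expression, a call such as sketch_u32_do(x, kDiv) exposes a compile-time magic number to the optimizer instead of a value loaded from memory at run time.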

I had a couple of questions/observations along the way.

1)

In many places the code looks like this:

static LIBDIVIDE_INLINE struct libdivide_u64_t libdivide_u64_gen(uint64_t d);
...
// later on (note, LIBDIVIDE_INLINE is missing):
struct libdivide_u64_t libdivide_u64_gen(uint64_t d) {
    return libdivide_internal_u64_gen(d, 0);
}

IMO, unlike static (which matters at the first declaration), inline has to go at the definition site to matter.
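A small sketch of why the missing macro stops being a style question once it expands to constexpr: C++ requires that if any declaration of a function is constexpr, all of its declarations are. SKETCH_INLINE, sketch_u64_t and sketch_u64_gen are hypothetical stand-ins here, not libdivide's definitions, and the struct layout is made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for LIBDIVIDE_INLINE redefined to include constexpr.
#define SKETCH_INLINE constexpr inline

// Made-up divider struct, only here so the function has something to return.
struct sketch_u64_t {
    uint64_t d;
    uint8_t more;
};

// Forward declaration, as in the header's declaration section.
static SKETCH_INLINE sketch_u64_t sketch_u64_gen(uint64_t d);

// Definition: the macro is repeated here. If it were dropped, this would
// redeclare a constexpr function without constexpr, which compilers reject.
static SKETCH_INLINE sketch_u64_t sketch_u64_gen(uint64_t d) {
    return sketch_u64_t{d, 0};
}

// Usable in a constant expression because both declarations agree.
static_assert(sketch_u64_gen(7).d == 7, "generator is evaluated at compile time");
```

Repeating the macro on every definition keeps the declarations consistent whether it expands to plain inline or to constexpr inline.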

2)

In my case, I needed (uint64_t / uint32_t) -> uint32_t. As far as I can tell, I simply have to use uint64_t / uint64_t and narrow the result to uint32_t. Could there be any optimizations for the 64/32 case, or does it have to be 64/64?
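For reference, a sketch of the widening approach described above, written with plain division rather than libdivide calls. div_u64_by_u32 and quotient_fits_u32 are hypothetical helper names. The point it illustrates is that a 64-bit dividend over a 32-bit divisor does not always yield a 32-bit quotient, so narrowing is only safe when the caller knows the result fits.

```cpp
#include <cstdint>

// Hypothetical helper: true when floor(n / d) fits in 32 bits. d must be nonzero.
constexpr bool quotient_fits_u32(uint64_t n, uint32_t d) {
    return n / d <= UINT32_MAX;
}

// Hypothetical helper: widen the divisor to 64 bits, divide, then narrow.
// The narrowing cast silently truncates if the quotient exceeds 32 bits.
constexpr uint32_t div_u64_by_u32(uint64_t n, uint32_t d) {
    return static_cast<uint32_t>(n / static_cast<uint64_t>(d));
}

// 10^10 / 3417614531 is 2 (remainder discarded), and that fits in 32 bits.
static_assert(quotient_fits_u32(10000000000ull, 3417614531u), "quotient fits");
static_assert(div_u64_by_u32(10000000000ull, 3417614531u) == 2u, "matches plain division");

// A large dividend over a small divisor does not fit.
static_assert(!quotient_fits_u32(UINT64_MAX, 2u), "quotient needs more than 32 bits");
```

With libdivide, the same widening would mean building a 64-bit divider from the 32-bit value and casting the result, as the toMicro_u function in the test code already does.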

pps83, Mar 15 '25 10:03