[FEA]: cuda::ceil_div
Is this a duplicate?
- [x] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct
Area
libcu++
Is your feature request related to a problem? Please describe.
Everyone copy-pastes this a million times; we should just include it:
__host__ __device__ constexpr size_t ceil_div(size_t a, size_t b) { return (a + b - 1)/b; }
Describe the solution you'd like
Describe alternatives you've considered
No response
Additional context
No response
I don't think this is a correct implementation: if a + b - 1 overflows, the sum wraps around and the result is wrong.
Should this be the following instead?
__host__ __device__ constexpr size_t ceil_div(size_t a, size_t b) { return a / b + (a % b != 0 ? 1 : 0); }
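To make the overflow concern concrete, here is a minimal host-only sketch comparing the two variants. The names `ceil_div_naive`, `ceil_div_safe`, and `kMax` are hypothetical and used only for illustration, and the `__host__ __device__` qualifiers are dropped so the snippet builds with a plain C++ compiler. Note the parentheses around the conditional: without them, `a / b + a % b ? 1 : 0` parses as `(a / b + a % b) ? 1 : 0`.

```cpp
#include <cstddef>
#include <limits>

// Variant from the original request: a + b - 1 can wrap around for large a.
constexpr std::size_t ceil_div_naive(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}

// Variant that never forms a + b - 1: quotient plus one if there is a remainder.
// Assumes b != 0.
constexpr std::size_t ceil_div_safe(std::size_t a, std::size_t b) {
    return a / b + (a % b != 0 ? 1 : 0);
}

constexpr std::size_t kMax = std::numeric_limits<std::size_t>::max();

// Both agree on small inputs.
static_assert(ceil_div_naive(10, 3) == 4, "");
static_assert(ceil_div_safe(10, 3) == 4, "");
static_assert(ceil_div_safe(9, 3) == 3, "");
static_assert(ceil_div_safe(0, 3) == 0, "");

// Near the top of the range, kMax + 2 - 1 wraps to 0, so the naive form returns 0.
static_assert(ceil_div_naive(kMax, 2) == 0, "");
static_assert(ceil_div_safe(kMax, 2) == kMax / 2 + 1, "");
```

The safe variant costs a division and a remainder, which compilers typically compute with a single instruction on common targets, though that is not guaranteed.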
This one is getting closer (see also some tests from brainstorming with @miscco): https://gcc.godbolt.org/z/vxszn8eqE
Since I have also written such a function countless times, I honestly think it should just be in the standard. Hit me up if you want to draft a proposal.
Next on the list is round_up_to_multiple(n, mult) -> ceil_div(n, mult) * mult.
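A rough sketch of what such a helper could look like, built on an overflow-safe `ceil_div`. Both functions here are illustrative stand-ins rather than existing libcu++ APIs, and they are written host-only so the snippet compiles without nvcc. The result is the smallest multiple of `mult` that is greater than or equal to `n`.

```cpp
#include <cstddef>

// Illustrative overflow-safe ceiling division. Assumes b != 0.
constexpr std::size_t ceil_div(std::size_t a, std::size_t b) {
    return a / b + (a % b != 0 ? 1 : 0);
}

// Hypothetical helper: round n up to the next multiple of mult.
// Assumes mult != 0. The multiplication can still wrap if the rounded
// value does not fit in std::size_t.
constexpr std::size_t round_up_to_multiple(std::size_t n, std::size_t mult) {
    return ceil_div(n, mult) * mult;
}

static_assert(round_up_to_multiple(10, 4) == 12, "");
static_assert(round_up_to_multiple(12, 4) == 12, "");
static_assert(round_up_to_multiple(0, 4) == 0, "");
static_assert(round_up_to_multiple(1, 256) == 256, "");
```

A common use is padding a buffer size to an alignment boundary, for example `round_up_to_multiple(num_bytes, 256)`.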
There was cub::DivideAndRoundUp() already which is quite similar. Maybe the two should be unified?
@pauleonix I already made that comment in the code you referred to :) The difference is that the CUB version accepts mixed types, and libcu++ doesn't. But this could be fixed in libcu++.
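For reference, one possible way to accept mixed integer types is to compute in `std::common_type`. This is only a sketch under that assumption, not the CUB or libcu++ implementation; the name `ceil_div_mixed` is hypothetical. It assumes `a >= 0` and `b > 0`: mixing a negative signed value with an unsigned type would convert it to a large unsigned value first.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical mixed-type ceiling division, computed in the common type of T and U.
// Assumes a >= 0 and b > 0. Requires C++14 for the relaxed constexpr body.
template <class T, class U>
constexpr typename std::common_type<T, U>::type ceil_div_mixed(T a, U b) {
    static_assert(std::is_integral<T>::value && std::is_integral<U>::value,
                  "ceil_div_mixed expects integer arguments");
    using R = typename std::common_type<T, U>::type;
    // Quotient plus one if there is a remainder; a + b - 1 is never formed.
    return static_cast<R>(a) / static_cast<R>(b)
         + (static_cast<R>(a) % static_cast<R>(b) != 0 ? 1 : 0);
}

// The result type follows the usual arithmetic conversions.
static_assert(std::is_same<decltype(ceil_div_mixed(10, 3u)), unsigned>::value, "");
static_assert(std::is_same<decltype(ceil_div_mixed(std::size_t{7}, 2)), std::size_t>::value, "");

static_assert(ceil_div_mixed(10, 3u) == 4u, "");
static_assert(ceil_div_mixed(std::size_t{7}, 2) == 4u, "");
static_assert(ceil_div_mixed(8L, 2) == 4L, "");
```

Whether the common type is the right result type for a unified API is a design question for the libraries involved; the sketch just shows that mixed arguments need not force callers to cast.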