celeritas icon indicating copy to clipboard operation
celeritas copied to clipboard

Correctly handle math function namespaces in host/hip/CUDA

Open sethrj opened this issue 3 months ago • 5 comments

As noted in the [CUDA math API[(https://docs.nvidia.com/cuda/cuda-math-api/index.html) and pointed it out by @amandalund more than once, we should not be using std::-namespaced math functions in CUDA code.

However, a problem arises when we try to compile the same code in multiple places, because the global namespace math functions do not do the same thing as the std namespace ones: they are strictly double precision. Furthermore some functions don't exist in std, and are overloaded in celeritas, and not overloaded in CUDA. Finally it's not clear whether HIP behaves like CUDA (global namespace, no overloads) or C++ (supporting overloads). Making matters worse, of course, is abs and the various implementations of it. See Why cstdlib is more complicated than you might think for more insight into this mess.

Here's the synopsis of our current situation with a couple of illustrative examples:

Function :: (<math.h>) std:: (<cmath>) :: (NVCC crt/math_functions.hpp) celeritas:: (corecel/Algorithms.hh)
fabs double  overloaded double
sincos double overloaded (also overloaded with Turn)
rsqrt double overloaded

So using ::fabs will (always? compiler dependent?) cast to double and return a double. ::sincos isn't available unless compiling with CUDA. Etc.

sethrj avatar Sep 09 '25 17:09 sethrj

Possible pathway is biting the bullet and adapting alpaka now that its maturity level is much higher than when we started in 2020.

sethrj avatar Sep 11 '25 12:09 sethrj

Would the NVIDIA C++ Standard Library be a possible solution? As of CUDA 12.3, it has support for cmath functions. Link to the documentation is here: https://docs.nvidia.com/cuda/cuda-c-std/index.html#overview

LSchwiebert avatar Oct 27 '25 13:10 LSchwiebert

Yeah actually that would be ideal.

sethrj avatar Oct 27 '25 16:10 sethrj

I'll look into it and see what changes are needed to use this.

LSchwiebert avatar Oct 27 '25 20:10 LSchwiebert

Thanks @LSchwiebert ! Since this hasn't yet caused any build/logic errors, I would say this is lower priority than the performance enhancements for async transport (and other changes like for syncwarp). Unless you're really itching to do this, I'd say let's defer it.

sethrj avatar Oct 27 '25 21:10 sethrj