superlu_dist
Build with CUDA 11.5 fails with "no instance of overloaded function matches the argument list"
I just tried to compile SuperLU_DIST (recent pull from git) with CUDA 11.5 (for debugging purposes), but compilation fails with:
~/Downloads/git-files/superlu_dist/SRC/pdgstrs_lsum_cuda.cu(1280): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (double *, double)
~/Downloads/git-files/superlu_dist/SRC/pdgstrs_lsum_cuda.cu(1385): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (double *, double)
~/Downloads/git-files/superlu_dist/SRC/pdgstrs_lsum_cuda.cu(1446): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (double *, double)
~/Downloads/git-files/superlu_dist/SRC/pdgstrs_lsum_cuda.cu(1738): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (double *, double)
If I compile without CUDA, everything works fine. I attached the CMakeCache file (CMakeCache.txt).
We have made a lot of updates lately. Can you please try again with the master branch? It should work with CUDA 11.
cc: @liuyangzhuan
I came across this same issue. For me, the problem was that the CMAKE_CUDA_ARCHITECTURES variable was set too low. Based on the output from running CMake to configure SuperLU, it seemed like the right CUDA arch was detected, but CMAKE_CUDA_ARCHITECTURES was not getting set accordingly. When I explicitly set CMAKE_CUDA_ARCHITECTURES to 60 or higher (atomicAdd with doubles requires compute capability 6.0 or higher), everything works as expected.
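To make this failure mode visible at configure time rather than as an nvcc error, a check along the following lines could be used. This is only a sketch and is not taken from SuperLU_DIST's CMakeLists.txt; the threshold of 60 comes from the compute capability requirement mentioned above, and the error message wording is made up for illustration.

```cmake
# Sketch only: not part of SuperLU_DIST's build files.
# Stop at configure time if any requested architecture is below 60,
# the minimum cited in this thread for atomicAdd on doubles.
foreach(arch IN LISTS CMAKE_CUDA_ARCHITECTURES)
  # Entries may carry suffixes such as "70-real"; keep only the leading digits.
  string(REGEX MATCH "^[0-9]+" arch_num "${arch}")
  if(arch_num AND arch_num LESS 60)
    message(FATAL_ERROR
      "CMAKE_CUDA_ARCHITECTURES contains ${arch}; 60 or higher is needed "
      "for atomicAdd(double*, double).")
  endif()
endforeach()
```

Non-numeric entries are skipped by the check, so it only rejects values that are clearly too low.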
I suspect https://github.com/xiaoyeli/superlu_dist/blob/master/CMakeLists.txt#L188 is not fully compatible with newer CMake.
I am not sure how to resolve this. I would think using "Auto" is the best way, which should capture the underlying GPU version. @liuyangzhuan: Do you know a better way of setting CUDA_ARCH_FLAGS?
I don't know if one can set that automatically. In the past, we always manually set CMAKE_CUDA_ARCHITECTURES.
Perhaps the <out_variable> is wrong:
Syntax: cuda_select_nvcc_arch_flags(<out_variable> [<target_CUDA_architecture> ...]) (see here: https://cmake.org/cmake/help/latest/module/FindCUDA.html)
Our use: cuda_select_nvcc_arch_flags(CUDA_ARCH_FLAGS Auto)
Perhaps change CUDA_ARCH_FLAGS to CMAKE_CUDA_ARCHITECTURES: cuda_select_nvcc_arch_flags(CMAKE_CUDA_ARCHITECTURES Auto)
My primary concern is not that the automatic detection does not work, but that the CMake output (produced by calling cuda_select_nvcc_arch_flags) makes it seem like it set the architecture when it did not. If SuperLU just requires CMAKE_CUDA_ARCHITECTURES to be set and stops calling cuda_select_nvcc_arch_flags, that's fine with me.
However, I do not think you can just change the output variable to CMAKE_CUDA_ARCHITECTURES as it expects a list of numbers like 60;70;80 etc, while I believe cuda_select_nvcc_arch_flags produces the actual flags (e.g. sm_70,compute_70). You may be interested in this discussion about the state of automatic CUDA architecture detection in CMake as well as this documentation page.
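To illustrate the format mismatch, here is a minimal sketch of turning detected compute capabilities into the numeric list that CMAKE_CUDA_ARCHITECTURES expects. The input string is a hypothetical placeholder, not real detection output, and the variable names are chosen only for this example.

```cmake
# Hypothetical input: compute capabilities as a space-separated string.
set(detected_ccs " 7.0 8.0 ")

# Trim whitespace, split on spaces into a CMake list, then drop the dots.
string(STRIP "${detected_ccs}" ccs)
string(REPLACE " " ";" ccs "${ccs}")
string(REPLACE "." "" arch_list "${ccs}")

# arch_list now holds the list 70;80.
message(STATUS "Architecture list: ${arch_list}")
```

The snippet uses only string commands, so it can be tried in script mode with cmake -P without a CUDA toolkit installed.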
Hi Cody, thanks for the information. I made a commit based on https://stackoverflow.com/questions/68223398/how-can-i-get-cmake-to-automatically-detect-the-value-for-cuda-architectures. See https://github.com/xiaoyeli/superlu_dist/commit/4fe14ba02cfedb7d60cfcdc8d7a6cb221010d3db. It works for me now; can you give it a try?
As you pointed out, for future CMake versions (>=3.24) we can just do: set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES native). We will keep this in mind once 3.24 is available on various platforms.
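A version guard is one way the two approaches could coexist until 3.24 is widespread. This is a sketch under assumptions: tgt is the placeholder target name used above and must already exist, and the fallback branch is left as a comment because it depends on whatever detection the project uses.

```cmake
# Sketch only: "tgt" is a placeholder for an existing CUDA target.
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.24")
  # Let CMake pick the architectures of the GPUs on the build machine.
  set_property(TARGET tgt PROPERTY CUDA_ARCHITECTURES native)
else()
  # Older CMake: rely on CMAKE_CUDA_ARCHITECTURES being set manually
  # or by the project's own detection logic.
endif()
```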
@liuyangzhuan The architecture is found and set, so it compiles with that commit (since my architecture is 70). However, the commit makes it impossible to manually set CMAKE_CUDA_ARCHITECTURES because the automatic detection always overrides it.
Good catch. This should fix it: https://github.com/xiaoyeli/superlu_dist/commit/6cd15699dd071431630d1bb25a0d83ce808358ab
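For readers who want the general pattern without opening the commit, a common way to let a user-supplied value win is to run detection only when the variable is not already defined. The following is a sketch of that pattern, not the contents of the commit; detected_archs and its value are hypothetical stand-ins for the result of a detection step.

```cmake
# Sketch only: not the code from the linked commit.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  # Hypothetical stand-in for the result of automatic detection.
  set(detected_archs 70)
  set(CMAKE_CUDA_ARCHITECTURES "${detected_archs}"
      CACHE STRING "CUDA architectures to build for")
endif()
message(STATUS "CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")
```

With this shape, a value passed as -DCMAKE_CUDA_ARCHITECTURES=... on the command line is left untouched. In recent CMake versions, enabling the CUDA language may itself populate this variable with a default, so a check like this would likely need to run before that point.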