Improved CUDA ARCH error messages for CUDA 11.5+

Open ptheywood opened this issue 2 years ago • 0 comments

flamegpu/detail/compute_capability.cuh, together with its associated .cpp and CMake code, uses the minimum CUDA architecture selected at CMake configuration time to provide a more user-friendly runtime error when the GPU is too old. This is required because __CUDA_ARCH__ is only defined during the nvcc device compilation pass for a single compute_xy arch, so it is not available to host code.

CUDA 11.5 introduced a new virtual architecture macro, __CUDA_ARCH_LIST__, which is available during host compilation passes (when invoked through nvcc, so in .cu files). It contains a comma-separated list of xy0 values, one for each compute_xy virtual architecture passed via the gencode options.

This could be used to improve error messages in use-cases where knowing only the minimum CUDA arch is not sufficient (or at least to improve the error message itself when CUDA 11.5+ is used). It does not include the real architectures (code=sm_xy), however, so it is still not perfect.

Because the macro expands to a bare comma-separated list of tokens rather than a quoted string, it needs to be stringified before it can be passed to printf, for instance.

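For example, the following nvcc compilation command line will define __CUDA_ARCH_LIST__ as 500,530,800 (each virtual architecture appears once, regardless of how many real architectures are generated from it):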

nvcc x.cu \
   --generate-code arch=compute_80,code=sm_80 \
   --generate-code arch=compute_50,code=sm_52 \
   --generate-code arch=compute_50,code=sm_50 \
   --generate-code arch=compute_53,code=sm_53

https://docs.nvidia.com/cuda/archive/11.5.0/cuda-compiler-driver-nvcc/index.html#virtual-architecture-macros
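Given a stringified list such as the one in the example above, a runtime check could compare it against the compute capability of the selected device. The following is a rough sketch under stated assumptions, not the existing compute_capability.cuh logic: parse_arch_list and arch_error_message are hypothetical helpers, and the check only considers virtual architectures, so it shares the limitation that real architectures are not visible.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split a list such as "500,530,800" into {500, 530, 800}.
std::vector<int> parse_arch_list(const std::string& list) {
    std::vector<int> archs;
    std::stringstream ss(list);
    std::string token;
    while (std::getline(ss, token, ',')) {
        if (!token.empty()) archs.push_back(std::stoi(token));
    }
    return archs;
}

// Hypothetical helper: returns an empty string if a device with compute
// capability major.minor is at least as new as the oldest compiled virtual
// architecture, otherwise an error message that includes the full list.
std::string arch_error_message(int major, int minor, const std::string& list) {
    const std::vector<int> archs = parse_arch_list(list);
    assert(!archs.empty());
    // Same xy0 encoding as the macro: compute capability 8.0 becomes 800.
    const int device_arch = major * 100 + minor * 10;
    const int min_arch = *std::min_element(archs.begin(), archs.end());
    if (device_arch >= min_arch) return "";
    std::ostringstream msg;
    msg << "Device compute capability " << major << "." << minor
        << " is older than every compiled virtual architecture (" << list << ")";
    return msg.str();
}
```

The major and minor values would typically come from the device properties reported by the CUDA runtime (for instance the major and minor fields filled in by cudaGetDeviceProperties).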

This is super low priority, just a feature I was previously unaware of.

ptheywood, Feb 23 '23 13:02