easybuild-framework

We should set $CUDAARCHS environment variable (plus additional cuda_cc_ template)

Open Micket opened this issue 3 years ago • 2 comments

Credit to @mboisson for this recommendation

Ref. https://cmake.org/cmake/help/latest/envvar/CUDAARCHS.html https://cmake.org/cmake/help/latest/variable/CMAKE_CUDA_ARCHITECTURES.html https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html#prop_tgt:CUDA_ARCHITECTURES

CUDAARCHS support was added to CMake just "to avoid specifying it for every invocation" (https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5533), but as an environment variable it also helps in the case of nested builds. It's also less CMake-specific, so who knows, maybe other build systems will pick up on it. Heads up though: if set, this changes the default value of the property for all CMake build targets, so extensive testing would be required.
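To illustrate the nested-build point, here is a minimal Python sketch of how an environment variable reaches child processes without being passed on each command line. The helper name `env_with_cudaarchs` is hypothetical and not part of EasyBuild, and the child process is only a stand-in for a nested `cmake` invocation.

```python
import os
import subprocess
import sys


def env_with_cudaarchs(cuda_compute_capabilities, base_env=None):
    """Return a copy of the environment with CUDAARCHS set.

    Hypothetical helper (not an EasyBuild function): turns ['7.0', '8.6']
    into the no-dot, semicolon-separated form '70;86'.
    """
    env = dict(os.environ if base_env is None else base_env)
    env['CUDAARCHS'] = ';'.join(cc.replace('.', '') for cc in cuda_compute_capabilities)
    return env


env = env_with_cudaarchs(['7.0', '8.6'])

# Stand-in for a nested build step: the child inherits CUDAARCHS from the
# environment, with nothing extra on its command line.
child = subprocess.run(
    [sys.executable, '-c', "import os; print(os.environ['CUDAARCHS'])"],
    env=env, capture_output=True, text=True, check=True,
)
nested_value = child.stdout.strip()
print(nested_value)
```

Whether a given nested build actually honours the variable still depends on how its CMake scripts are written.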

Regardless of whether or not we add this variable, we are desperately missing a template for the no-dot, semicolon-separated compute capabilities. We currently have:

cuda_cc_semicolon_sep='7.0;8.6'
cuda_cc_space_sep='7.0 8.6'
cuda_compute_capabilities='7.0,8.6'
cuda_sm_comma_sep='sm_70,sm_86'
cuda_sm_space_sep='sm_70 sm_86'

and we need something like

cuda_cc_cmake='70;86'

as well. It's especially important because there is simply no way at all to deal with this trivial removal of a dot, or the switch to a semicolon, from within an easyconfig. The template isn't expanded when that code runs, and we can't access cuda_compute_capabilities either, which forces us to patch custom code into build scripts to handle our made-up format.
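For reference, a rough Python sketch of how all of these values could be derived from one list of compute capabilities. The function `cuda_cc_templates` is hypothetical and is not the framework's actual implementation; `cuda_cc_cmake` is the name proposed in this issue, not an existing template.

```python
def cuda_cc_templates(cuda_compute_capabilities):
    """Derive template values from a list such as ['7.0', '8.6'].

    Hypothetical helper for illustration only.
    """
    ccs = list(cuda_compute_capabilities)
    no_dot = [cc.replace('.', '') for cc in ccs]
    return {
        'cuda_compute_capabilities': ','.join(ccs),
        'cuda_cc_space_sep': ' '.join(ccs),
        'cuda_cc_semicolon_sep': ';'.join(ccs),
        'cuda_sm_comma_sep': ','.join('sm_' + cc for cc in no_dot),
        'cuda_sm_space_sep': ' '.join('sm_' + cc for cc in no_dot),
        # Proposed here: no dots, semicolon-separated, as CMake expects
        'cuda_cc_cmake': ';'.join(no_dot),
    }


templates = cuda_cc_templates(['7.0', '8.6'])
for name, value in sorted(templates.items()):
    print("%s=%r" % (name, value))
```

The only new piece is the last entry; the rest just mirrors the existing templates listed above.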

Micket avatar Sep 30 '22 18:09 Micket

Had to learn a bit about CMake; this variable is only used if the CMake scripts use the new mode for building, meaning a project that:

  1. Uses enable_language(CUDA) or project(Foo LANGUAGES CUDA ...)
  2. Doesn't use the old find_package(CUDA) or any of the cuda_add_xxxx macros it defines. It should just use a normal add_.... It may optionally use find_package(CUDAToolkit) for linking to cuFFT etc.

I don't recall actually coming across any CMakeLists.txt that does this yet, but it doesn't hurt to support this from the start, and possibly patch stuff to do the right thing.
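To get a feel for which projects fall into which camp, the two criteria above could be checked with a rough heuristic like the following Python sketch. The function name `cuda_build_style` and the regular expressions are my own assumptions; it only scans the text of a single file and does not follow includes, macros, or variables.

```python
import re

# New style: enable_language(... CUDA ...) or project(... LANGUAGES ... CUDA ...)
NEW_STYLE = re.compile(
    r'\benable_language\s*\([^)]*\bCUDA\b'
    r'|\bproject\s*\([^)]*\bLANGUAGES\b[^)]*\bCUDA\b',
    re.IGNORECASE,
)
# Old style: find_package(CUDA ...) or cuda_add_xxxx(...).
# The \b after CUDA keeps find_package(CUDAToolkit) from matching.
OLD_STYLE = re.compile(
    r'\bfind_package\s*\(\s*CUDA\b|\bcuda_add_\w+\s*\(',
    re.IGNORECASE,
)


def cuda_build_style(cmakelists_text):
    """Classify CMakeLists.txt content as 'new', 'old' or 'none' (heuristic)."""
    # Drop comments so commented-out commands are not counted
    text = re.sub(r'#.*', '', cmakelists_text)
    if OLD_STYLE.search(text):
        return 'old'
    if NEW_STYLE.search(text):
        return 'new'
    return 'none'


new_example = (
    "project(Foo LANGUAGES CXX CUDA)\n"
    "find_package(CUDAToolkit)\n"
    "add_library(foo foo.cu)\n"
)
old_example = (
    "find_package(CUDA REQUIRED)\n"
    "cuda_add_library(foo foo.cu)\n"
)
print(cuda_build_style(new_example), cuda_build_style(old_example))
```

Only files classified as 'new' would be expected to pick up CUDAARCHS; anything 'old' would still need its architectures passed in some other way.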

Micket avatar Oct 03 '22 09:10 Micket

To add to that from Slack:

I see some older easyconfigs making this mistake:

# default CUDA compute capabilities to use (override via --cuda-compute-capabilities)
cuda_compute_capabilities = ['3.5', '3.7', '5.2', '6.0', '6.1', '7.0', '7.2', '7.5', '8.0']

# replace hardcoded CUDA compute capabilities in liquidSVM
local_cuda_cc = [c.replace('.', '') for c in cuda_compute_capabilities]
local_cuda_arch = "-arch sm_%s" % local_cuda_cc[0]
local_cuda_gencode = ' '.join(['-gencode=arch=compute_%s,code=sm_%s' % (c, c) for c in local_cuda_cc])
local_liquidSVM_sed = "sed -i 's/-arch sm_30/%s %s/' src/Makevars.in" % (local_cuda_arch, local_cuda_gencode)

Hence we should check existing ECs for this wrong usage.
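A quick way to find candidates could look like the Python sketch below. It flags lines that use the `cuda_compute_capabilities` variable directly instead of through a `%(...)s` template. My reading, which is an assumption, is that this is wrong because such values are computed from the hard-coded list and an override via --cuda-compute-capabilities never reaches them. The function name `find_direct_uses` is hypothetical, and the comment stripping is naive (it also cuts at a `#` inside a string).

```python
import re

# The assignment itself is fine; only later direct uses are suspicious
ASSIGN = re.compile(r'^\s*cuda_compute_capabilities\s*=')
# Direct use of the variable, but not the %(cuda_compute_capabilities)s template
USE = re.compile(r'(?<!\w)(?<!%\()cuda_compute_capabilities\b')


def find_direct_uses(easyconfig_text):
    """Return (line number, line) pairs that use the variable directly."""
    hits = []
    for lineno, line in enumerate(easyconfig_text.splitlines(), start=1):
        code = line.split('#', 1)[0]  # naive comment stripping
        if ASSIGN.match(code):
            continue
        if USE.search(code):
            hits.append((lineno, line.strip()))
    return hits


sample = """cuda_compute_capabilities = ['7.0', '8.6']
local_cuda_cc = [c.replace('.', '') for c in cuda_compute_capabilities]
configopts = '-DARCHS="%(cuda_cc_semicolon_sep)s"'
"""
hits = find_direct_uses(sample)
for lineno, line in hits:
    print("line %d: %s" % (lineno, line))
```

Anything it reports still needs a manual look, since the heuristic cannot tell whether the derived value ends up somewhere that matters.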

Flamefire avatar Oct 05 '22 08:10 Flamefire