libcudacxx
Replace uses of `__CUDA_ARCH__` and `__NVCOMPILER_CUDA_ARCH__` for compile time target version checks
We currently use `__CUDA_ARCH__`/`__NVCOMPILER_CUDA_ARCH__` in a few places that are difficult to replace with `if target`:
- For some headers like `<cuda/[std/]atomic>` and `<cuda/[std/]barrier>`, we need to produce a compile time error if the header is being compiled for an older SM target.
- There are also some `memcpy_async` implementation details that are `#if`'d out for older SM targets. I think we should probably just allow these to be present for all SM targets.
  - https://github.com/NVIDIA/libcudacxx/blob/feature/nvcxx-compatibility/include/cuda/std/barrier#L307
- `atomic_flag`'s wait/notify member functions are only defined for newer targets. Note that we do NOT do this for `atomic`, which is strange.
  - https://github.com/NVIDIA/libcudacxx/blob/feature/nvcxx-compatibility/libcxx/include/atomic#L2600
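For reference, the kind of check being discussed looks roughly like the sketch below. This is not the actual libcudacxx source: the `700` threshold, the error message, and `example_async_detail` are illustrative assumptions. The point is that both the `#error` and the `#if`'d-out declaration depend on `__CUDA_ARCH__` being defined during the device pass.

```cpp
// Sketch only: the threshold and message are illustrative, not copied from libcudacxx.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < 700
#  error "this header requires a newer SM target (illustrative threshold)"
#endif

// Hypothetical stand-in for an implementation detail that is only
// declared when the current target is new enough (or when compiling for host).
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 700
inline int example_async_detail() { return 1; }
#endif
```

When `__CUDA_ARCH__` is not defined, neither check does anything: the `#error` never fires and the guarded declaration is always present, whatever the SM targets are.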
Possible solutions:
- Don't emit an error for older SMs with NVC++. This would lead to (possibly cryptic) compile time failures in some cases and runtime failures in some cases.
- Add some sort of compile time "do all targets provide"/"do any target provide" mechanism to `<nv/target>` that uses `NV_TARGET_SM_INTEGER_LIST` instead to detect if any of the SMs in the list don't meet the requirements of the feature. This would require some preprocessor logic.
- Add some sort of `static_assert_target` facility to NVC++. This wouldn't solve the case of the `memcpy_async` overloads that should only be present for newer targets.
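To make the "do all targets provide"/"do any target provide" idea concrete, here is a minimal sketch. It makes two assumptions. First, `EXAMPLE_SM_LIST` is a hypothetical stand-in for a comma-separated list of SM integers; the real format and values of `NV_TARGET_SM_INTEGER_LIST` are not assumed here. Second, it uses `constexpr` functions plus `static_assert` rather than the preprocessor logic mentioned above, so it can emit a compile time error but cannot `#if` declarations in or out.

```cpp
#include <cstddef>

// ASSUMPTION: hypothetical stand-in for a compiler-provided, comma-separated
// list of SM targets. The values and their scale are illustrative.
#ifndef EXAMPLE_SM_LIST
#  define EXAMPLE_SM_LIST 70, 80, 86
#endif

namespace example {

constexpr int sm_list[] = {EXAMPLE_SM_LIST};
constexpr std::size_t sm_count = sizeof(sm_list) / sizeof(sm_list[0]);

// True when every target in the list is at least `min_sm`.
// Written as a single return statement so it is valid C++11 constexpr.
constexpr bool all_targets_provide(int min_sm, std::size_t i = 0) {
  return i == sm_count
             ? true
             : (sm_list[i] >= min_sm && all_targets_provide(min_sm, i + 1));
}

// True when at least one target in the list is at least `min_sm`.
constexpr bool any_target_provides(int min_sm, std::size_t i = 0) {
  return i == sm_count
             ? false
             : (sm_list[i] >= min_sm || any_target_provides(min_sm, i + 1));
}

}  // namespace example

// A header could reject the build when any requested target is too old.
static_assert(example::all_targets_provide(70),
              "every SM target in the list must meet the illustrative minimum");
```

A preprocessor-only version would still be needed wherever declarations have to be conditionally removed, which is the `memcpy_async` case.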