
Replace uses of `__CUDA_ARCH__` and `__NVCOMPILER_CUDA_ARCH__` for compile time target version checks

Open brycelelbach opened this issue 3 years ago • 0 comments

We currently use `__CUDA_ARCH__`/`__NVCOMPILER_CUDA_ARCH__` in a few places that are difficult to replace with `if target`:

  • For some headers like <cuda/[std/]atomic> and <cuda/[std/]barrier>, we need to produce a compile time error if the header is being compiled for an older SM target.
  • There are also some memcpy_async implementation details that are #if'd out for older SM targets. I think we should probably just allow these to be present for all SM targets.
    • https://github.com/NVIDIA/libcudacxx/blob/feature/nvcxx-compatibility/include/cuda/std/barrier#L307
  • atomic_flag's wait/notify member functions are only defined for newer targets. Note that we do NOT do this for atomic, which is strange.
    • https://github.com/NVIDIA/libcudacxx/blob/feature/nvcxx-compatibility/libcxx/include/atomic#L2600
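
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of preprocessor check described in the list above. It is not the actual libcudacxx code: `EXAMPLE_MIN_ARCH`, `EXAMPLE_HAS_WAIT_NOTIFY`, and `example_flag` are hypothetical names, and the threshold of 700 is only an assumed value for illustration.

```cpp
// Hypothetical threshold for illustration; real headers choose their own value.
#define EXAMPLE_MIN_ARCH 700

// __CUDA_ARCH__ is only defined while NVCC compiles device code for one
// specific target (e.g. 700 for sm_70), so a host-only compile skips this.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ < EXAMPLE_MIN_ARCH
#  error "this header requires a newer SM target"
#endif

// The same macro can hide declarations for older targets.
#if !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= EXAMPLE_MIN_ARCH
#  define EXAMPLE_HAS_WAIT_NOTIFY 1
#else
#  define EXAMPLE_HAS_WAIT_NOTIFY 0
#endif

struct example_flag {
  bool value = false;
#if EXAMPLE_HAS_WAIT_NOTIFY
  // Stand-in for a member function that only exists for newer targets.
  bool test() const { return value; }
#endif
};
```

Both uses depend on the preprocessor seeing one target architecture at a time, which is the assumption that makes them hard to express with `if target`.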

Possible solutions:

  • Don't emit an error for older SMs with NVC++. This would lead to (possibly cryptic) compile time failures in some cases and runtime failures in others.
  • Add some sort of compile time "do all targets provide"/"do any targets provide" mechanism to `<nv/target>` that uses `NV_TARGET_SM_INTEGER_LIST` instead of `__CUDA_ARCH__` to detect whether any of the SMs in the list fail to meet the requirements of the feature. This would require some preprocessor logic.
  • Add some sort of static_assert_target facility to NVC++. This wouldn't solve the case of the memcpy_async overloads that should only be present for newer targets.
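
As a rough illustration of the second option, the sketch below shows what an "all targets"/"any target" query over a list of SM values might look like. It makes several assumptions: `EXAMPLE_SM_INTEGER_LIST` is a hypothetical stand-in for `NV_TARGET_SM_INTEGER_LIST`, assumed here to expand to a comma-separated list of integers; the values 60, 70, 80 are made up; and it uses C++14 `constexpr` functions rather than the preprocessor logic mentioned above, so it could only drive a `static_assert`, not `#if` out declarations.

```cpp
#include <cstddef>

// Hypothetical stand-in for NV_TARGET_SM_INTEGER_LIST; the format and the
// values are assumptions made for this example.
#define EXAMPLE_SM_INTEGER_LIST 60, 70, 80

constexpr int example_sm_list[] = {EXAMPLE_SM_INTEGER_LIST};
constexpr std::size_t example_sm_count =
    sizeof(example_sm_list) / sizeof(example_sm_list[0]);

// True if every SM in the list is at least min_sm.
constexpr bool all_targets_provide(int min_sm) {
  for (std::size_t i = 0; i < example_sm_count; ++i) {
    if (example_sm_list[i] < min_sm) return false;
  }
  return true;
}

// True if at least one SM in the list is at least min_sm.
constexpr bool any_target_provides(int min_sm) {
  for (std::size_t i = 0; i < example_sm_count; ++i) {
    if (example_sm_list[i] >= min_sm) return true;
  }
  return false;
}

// A header could then reject unsupported target lists at compile time.
static_assert(any_target_provides(70),
              "no target in the list supports this feature");
```

With this particular list, a header requiring every target to be 70 or newer would fail its check, because 60 is present.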

brycelelbach, Mar 29 '21 17:03