unified-runtime [CUDA] Remove support for concurrent atomic access to host-allocated pinned memory

Removes UR_DEVICE_USM_ACCESS_CAPABILITY_FLAG_ATOMIC_CONCURRENT_ACCESS from the capabilities reported for host memory. The change is needed because CUDA does not support concurrent atomic access between host and device to pinned memory.

The CUDA adapter allocates host memory with cuMemAllocHost, which returns page-locked (pinned) host memory. However, this type of memory does not support concurrent atomic access between host and device, as detailed in this related discussion.

Furthermore, this issue shows up as a failure in SYCL-CTS-USM when check_atomic_access is specialized with sycl::usm::alloc::host.
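As a rough illustration of the change described above, here is a minimal, self-contained C++ sketch of a host USM capability mask with the atomic-concurrent bit cleared. Everything in it is hypothetical: the enum and its bit values are stand-ins for the real UR_DEVICE_USM_ACCESS_CAPABILITY_FLAG_* definitions in ur_api.h, the function names are invented for the example, and the set of flags that remain reported is an assumption rather than a copy of the adapter code.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical stand-ins for the UR USM access capability bits.
// The real names and values are defined in ur_api.h.
enum UsmAccessFlag : std::uint32_t {
    ACCESS                   = 1u << 0,
    ATOMIC_ACCESS            = 1u << 1,
    CONCURRENT_ACCESS        = 1u << 2,
    ATOMIC_CONCURRENT_ACCESS = 1u << 3,
};

// Returns true if every bit of `flag` is set in `caps`.
constexpr bool has(std::uint32_t caps, std::uint32_t flag) {
    return (caps & flag) == flag;
}

// Assumed capability mask for host USM before the change:
// all four capabilities reported.
constexpr std::uint32_t hostUsmCapsBefore() {
    return ACCESS | ATOMIC_ACCESS | CONCURRENT_ACCESS |
           ATOMIC_CONCURRENT_ACCESS;
}

// Capability mask after the change: the same mask with only the
// atomic-concurrent bit cleared, since pinned memory obtained from
// cuMemAllocHost does not support it.
constexpr std::uint32_t hostUsmCapsAfter() {
    return hostUsmCapsBefore() &
           ~static_cast<std::uint32_t>(ATOMIC_CONCURRENT_ACCESS);
}

// Small self-check: the removed bit is gone, the others are untouched.
inline void checkHostCaps() {
    assert(!has(hostUsmCapsAfter(), ATOMIC_CONCURRENT_ACCESS));
    assert(has(hostUsmCapsAfter(), ACCESS | ATOMIC_ACCESS | CONCURRENT_ACCESS));
}

} // namespace sketch
```

A caller that consults the mask before running host/device concurrent atomics would then skip that path for host allocations, which is presumably how a test such as check_atomic_access stops expecting the behaviour on CUDA.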

Apr 26 '24 09:04 mmoadeli

What about HIP?

Can you please add the following check in the HIP backend? Thanks.

https://github.com/ROCm/HIP/issues/3244#issuecomment-2077559740

Apr 26 '24 22:04 jinz2014

> What about HIP?
>
> Can you please add the following check in the HIP backend? Thanks.
>
> ROCm/HIP#3244 (comment)

I'd prefer to handle ROCm/HIP#3244 (comment) in another patch.

The SYCL-CTS tests that fail on the CUDA backend pass on the HIP backend, which is why I did not modify the HIP backend.

Apr 29 '24 08:04 mmoadeli

Ok. Thanks.

Apr 29 '24 12:04 jinz2014

@kbenzie a friendly request to have this merged, please.

Jun 03 '24 08:06 mmoadeli