UCM/CUDA/TEST: Install memory hooks for async Cuda allocations

Open yosefe opened this issue 3 years ago • 6 comments

Why

As discussed in #7194 and #7110, we need to add memory hook support for CUDA async allocations. Without it, applications that use these allocations may fail to detect CUDA memory and hit a segfault or access error.
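For illustration only, here is a minimal sketch of how a pointer returned by cudaMallocAsync could be classified by asking the CUDA runtime directly, which is the kind of detection that goes wrong when the allocation is not intercepted. The helper name `memory_type_name` is hypothetical and not part of UCX, and the sketch assumes CUDA 11.2 or newer and that pointer attribute queries work on stream-ordered allocations.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not a UCX function): classify a pointer by querying
// the CUDA runtime for its attributes.
static const char *memory_type_name(const void *ptr)
{
    cudaPointerAttributes attr;

    if (cudaPointerGetAttributes(&attr, ptr) != cudaSuccess) {
        cudaGetLastError(); // reset the runtime's last-error state
        return "unknown";
    }

    switch (attr.type) {
    case cudaMemoryTypeDevice:
        return "device";
    case cudaMemoryTypeManaged:
        return "managed";
    case cudaMemoryTypeHost:
        return "pinned host";
    default:
        return "unregistered host";
    }
}

int main()
{
    cudaStream_t stream;
    void *x = nullptr;

    if (cudaStreamCreate(&stream) != cudaSuccess) {
        return 1;
    }

    // Stream-ordered allocation: the case this PR is about.
    if (cudaMallocAsync(&x, 1 << 20, stream) != cudaSuccess) {
        cudaStreamDestroy(stream);
        return 1;
    }
    cudaStreamSynchronize(stream);

    printf("cudaMallocAsync pointer is classified as: %s\n",
           memory_type_name(x));

    cudaFreeAsync(x, stream);
    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
    return 0;
}
```

A library that relies on allocation hooks rather than a per-pointer query like this one would treat the buffer as host memory if the cudaMallocAsync call is never intercepted.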

yosefe avatar Aug 07 '21 15:08 yosefe

@Akshay-Venkatesh WDYT?

yosefe avatar Aug 09 '21 09:08 yosefe

/azp run

yosefe avatar Aug 09 '21 19:08 yosefe

Azure Pipelines successfully started running 2 pipeline(s).

azure-pipelines[bot] avatar Aug 09 '21 19:08 azure-pipelines[bot]

@yosefe I forgot to bring up the lack of sync memops support on MallocAsync memory, which may become a problem because of this PR. Adding this PR would likely result in the IB or cuda-ipc UCTs being used to move memory allocated through MallocAsync, but the following sequence could lead to stale data being transferred:

cudaMallocAsync(&x, length1, stream1);
cudaStreamSynchronize(stream1);
...
cudaMemcpy(x, y, length2, cudaMemcpyHostToDevice); // potentially non-blocking wrt CPU and copy to destination x may still be in flight
ucp_tag_send_nbx(x, ...); // region pointed by x is not valid yet because previous memcpy is still in flight

Setting the sync memops attribute on x would synchronize all outstanding memory operations on it, but the attribute is not supported on MallocAsync memory. This could therefore lead to data validation issues, whether the transfer goes through zcopy operations over ib/cuda_ipc or through pipeline protocols.
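As a rough sketch of the point above (not the UCX implementation), the following program tries to set CU_POINTER_ATTRIBUTE_SYNC_MEMOPS on a cudaMallocAsync buffer and, regardless of the outcome, shows an application-side workaround: issue the copy on a known stream and synchronize that stream before handing the buffer to a send call. The buffer size and variable names are assumptions made for the example, and the exact status returned by cuPointerSetAttribute is printed rather than assumed. Link against both the runtime and the driver library (for example `-lcudart -lcuda`).

```cuda
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    const size_t length = 1 << 20; // arbitrary size chosen for the example
    cudaStream_t stream;
    void *x = nullptr;
    void *y = malloc(length);      // pageable host source buffer

    if (y == nullptr) {
        return 1;
    }
    memset(y, 0xab, length);

    if (cudaStreamCreate(&stream) != cudaSuccess) {
        free(y);
        return 1;
    }
    if (cudaMallocAsync(&x, length, stream) != cudaSuccess) {
        cudaStreamDestroy(stream);
        free(y);
        return 1;
    }
    cudaStreamSynchronize(stream);

    // Try to request synchronous memory operations on the allocation.
    // Per the discussion above, this is expected not to be supported on
    // cudaMallocAsync memory; the status is reported, not assumed.
    unsigned int sync_flag = 1;
    CUresult res = cuPointerSetAttribute(&sync_flag,
                                         CU_POINTER_ATTRIBUTE_SYNC_MEMOPS,
                                         (CUdeviceptr)(uintptr_t)x);
    printf("cuPointerSetAttribute(SYNC_MEMOPS) returned %d\n", (int)res);

    // Workaround sketch: order the copy on a known stream and wait for it,
    // so the data in x is complete before any send is posted.
    if (cudaMemcpyAsync(x, y, length, cudaMemcpyHostToDevice, stream) ==
        cudaSuccess) {
        cudaStreamSynchronize(stream);
        // At this point x could be passed to a send call such as
        // ucp_tag_send_nbx (not called here).
    }

    cudaFreeAsync(x, stream);
    cudaStreamSynchronize(stream);
    cudaStreamDestroy(stream);
    free(y);
    return 0;
}
```

The explicit stream synchronization places the ordering burden on the application, which is the responsibility the sync memops attribute would otherwise take over.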

Akshay-Venkatesh avatar Aug 10 '21 01:08 Akshay-Venkatesh

Any update on this?

simonbyrne avatar May 06 '22 17:05 simonbyrne

@simonbyrne SYNC_MEMOPS is not yet supported for memory allocated with the Malloc Async API. We plan to support such memory once that support becomes available.

Akshay-Venkatesh avatar May 06 '22 18:05 Akshay-Venkatesh

replaced by #8623

yosefe avatar Nov 02 '22 16:11 yosefe