
UCT/IB: Support cuda-managed memory when ODP is available

Open tvegas1 opened this issue 6 months ago • 5 comments

What

Allow memory registration for cuda-managed memory when ODP is enabled on machines with coherent memory.

Why ?

Needed to enable RDMA on cuda-managed memory without long-term page pinning.

How ?

Advertise cuda-managed support in UCT/IB, and always request non-blocking registration for cuda-managed memory, since registration currently fails otherwise.
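The two steps above can be sketched roughly as follows. This is only an illustration, not the actual patch: every type, macro, and function name here (`mem_type_t`, `REG_FLAG_NONBLOCK`, `md_caps_t`, `md_reg_mem_types`, `md_fixup_reg_flags`) is a hypothetical stand-in rather than a UCX definition, and gating the advertisement on ODP availability alone is an assumption; the real condition in the change may be stricter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT UCX definitions. */
typedef enum {
    MEM_TYPE_HOST,
    MEM_TYPE_CUDA,
    MEM_TYPE_CUDA_MANAGED
} mem_type_t;

#define BIT(i)            (1u << (i))
#define REG_FLAG_NONBLOCK (1u << 0) /* ask for ODP-style, non-pinning registration */

typedef struct {
    int odp_available; /* assumed to be detected when the memory domain is opened */
} md_caps_t;

/* Step 1: advertise cuda-managed as registrable only when ODP is available
 * (assumption: ODP availability is the only gate). */
static unsigned md_reg_mem_types(const md_caps_t *caps)
{
    unsigned types;

    assert(caps != NULL);
    types = BIT(MEM_TYPE_HOST);
    if (caps->odp_available) {
        types |= BIT(MEM_TYPE_CUDA_MANAGED);
    }
    return types;
}

/* Step 2: force the non-blocking flag for cuda-managed memory, whatever the
 * caller passed, because a blocking registration of such memory fails. */
static unsigned md_fixup_reg_flags(mem_type_t mem_type, unsigned flags)
{
    if (mem_type == MEM_TYPE_CUDA_MANAGED) {
        flags |= REG_FLAG_NONBLOCK;
    }
    return flags;
}
```

Forcing the flag inside the registration path, rather than relying on every caller to pass it, keeps existing callers working unchanged.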

Test

mpirun -mca coll ^hcoll -mca pml ucx -np 2 osu_bw -d managed MD MD
./bin/ucx_perftest -m cuda-managed -t <tag_bw|ucp_put_bw|ucp_get>
  • Confirmed actual RDMA operation against cuda-managed memory with ODPv1
  • Enabled the RDMA assertion in gtest
  • Tested on GH200 and on A100 (without coherent memory)
  • Tested with UCX_IB_ODP_PREFETCH=y

tvegas1, Aug 23 '24 11:08