Does cuda_ipc support GPU cards without NVLink?

Open Jeffwan opened this issue 2 weeks ago • 0 comments

Configuration

parser.c:2368 UCX  INFO  UCX_* env variables: UCX_PROTO_INFO=y UCX_LOG_LEVEL=debug UCX_TLS=cuda_ipc,cuda_copy,tcp

Logs

[1765333153.178129] [g340-cd51-4900-11aa-cea2-3aba-b2e8:1586361:2]      ucp_worker.c:1912 UCX  INFO    ucp_context_0 intra-node cfg#2 rma_am(tcp/eth1)  amo_am(tcp/eth1)  am(tcp/eth1 tcp/eth5 tcp/eth7 tcp/eth6 tcp/eth8 cuda_ipc/cuda)  ka(tcp/eth1)

It seems cuda_ipc is not part of the data transfer protocols.

For RMA lanes (not RMA_BW), the wireup selection requires these flags (from select.c:1176-1180):
  UCT_IFACE_FLAG_PUT_SHORT |
  UCT_IFACE_FLAG_PUT_BCOPY |
  UCT_IFACE_FLAG_GET_BCOPY |
  UCT_IFACE_FLAG_PENDING

But cuda_ipc only has (from lines 275-280):
  UCT_IFACE_FLAG_GET_ZCOPY |
  UCT_IFACE_FLAG_PUT_ZCOPY |
  UCT_IFACE_FLAG_PENDING

cuda_ipc is missing PUT_SHORT, PUT_BCOPY, and GET_BCOPY; it only supports zcopy operations. Does this mean cuda_ipc is not selected for the rma_am lane because it lacks the short and bcopy capabilities that RMA lane selection requires?
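
To make the question concrete, here is a minimal sketch of the check I am assuming takes place: a transport is eligible for a lane only if its capability mask contains every required flag. This is not UCX code. The `EX_FLAG_*` bit values and the `ex_*` helper names are placeholders made up for illustration; the real `UCT_IFACE_FLAG_*` constants and the actual selection logic live in the UCX sources.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; these are NOT the
 * real UCT_IFACE_FLAG_* values from the UCX headers. */
#define EX_FLAG_PUT_SHORT (UINT64_C(1) << 0)
#define EX_FLAG_PUT_BCOPY (UINT64_C(1) << 1)
#define EX_FLAG_PUT_ZCOPY (UINT64_C(1) << 2)
#define EX_FLAG_GET_BCOPY (UINT64_C(1) << 3)
#define EX_FLAG_GET_ZCOPY (UINT64_C(1) << 4)
#define EX_FLAG_PENDING   (UINT64_C(1) << 5)

/* Flags the RMA lane requires, as quoted from select.c above. */
uint64_t ex_rma_required_flags(void)
{
    return EX_FLAG_PUT_SHORT | EX_FLAG_PUT_BCOPY |
           EX_FLAG_GET_BCOPY | EX_FLAG_PENDING;
}

/* Flags cuda_ipc has, as quoted above. */
uint64_t ex_cuda_ipc_flags(void)
{
    return EX_FLAG_GET_ZCOPY | EX_FLAG_PUT_ZCOPY | EX_FLAG_PENDING;
}

/* Required flags that the transport does not provide. */
uint64_t ex_missing_flags(uint64_t required, uint64_t available)
{
    return required & ~available;
}

/* Assumed eligibility rule: every required flag must be present. */
int ex_is_eligible(uint64_t required, uint64_t available)
{
    return ex_missing_flags(required, available) == 0;
}
```

Under that assumption, `ex_is_eligible(ex_rma_required_flags(), ex_cuda_ipc_flags())` returns 0, and `ex_missing_flags` reports exactly PUT_SHORT, PUT_BCOPY, and GET_BCOPY as missing.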

Jeffwan, Dec 10 '25 04:12