CUDA: support for lazy init

Open Sergei-Lebedev opened this issue 1 year ago • 0 comments

What

Lazily initialize TL NCCL and TL CUDA on the first CUDA collective.

Why?

Both TL NCCL and TL CUDA require the CUDA device to be set before team creation. In MPI workloads this is not always possible: MPI_Init creates the UCC team, but setting the device requires knowing the rank and local rank, which the application typically learns only after MPI_Init returns.
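A minimal sketch of the deferred-initialization pattern requested here, in plain C with no UCC or CUDA dependency. Everything in it is hypothetical: `lazy_team_t`, `lazy_team_create`, `lazy_team_ensure_init` and `lazy_team_collective` are illustrative names, not UCC APIs, and the `current_device` argument stands in for whatever a real TL would get from querying the CUDA runtime (for example `cudaGetDevice`). The idea is that team create does no device-dependent work, and the first collective triggers the one-time setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a TL team; not the UCC data structure. */
typedef struct {
    bool initialized; /* has the device-dependent setup run yet? */
    int  device;      /* device bound at first collective, -1 before that */
    int  init_count;  /* how many times setup ran (should never exceed 1) */
} lazy_team_t;

/* Team create does no device-dependent work, so it is safe to call
 * before the application has selected a device (e.g. inside MPI_Init). */
static void lazy_team_create(lazy_team_t *team)
{
    assert(team != NULL);
    team->initialized = false;
    team->device      = -1;
    team->init_count  = 0;
}

/* Device-dependent setup, deferred until a device is known.
 * A real TL would create communicators, streams, etc. here;
 * this sketch only records the device it was given. */
static int lazy_team_ensure_init(lazy_team_t *team, int current_device)
{
    if (team->initialized) {
        return 0;  /* already set up: nothing to do */
    }
    if (current_device < 0) {
        return -1; /* no device selected yet: cannot initialize */
    }
    team->device      = current_device;
    team->initialized = true;
    team->init_count++;
    return 0;
}

/* Hypothetical collective entry point: initialize on first use, then run. */
static int lazy_team_collective(lazy_team_t *team, int current_device)
{
    int status = lazy_team_ensure_init(team, current_device);
    if (status != 0) {
        return status;
    }
    /* ... the collective would be posted on team->device here ... */
    return 0;
}
```

In this sketch the team stays bound to the device seen at the first collective; whether a real implementation should reject or tolerate a later device change is a design question the issue leaves open.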

Sergei-Lebedev, Mar 27 '23 11:03