Provide a NCCL-like initialization mechanism

Open almogsegal opened this issue 2 years ago • 5 comments

The existing OOB initialization may be a blocker for some users to adopt UCC. I suggest to add another initialization alternative that would allow users to query a unique identifier (ucp_address?) and do the communication themselves.

almogsegal avatar Jan 03 '23 07:01 almogsegal

@manjugv FYI.

almogsegal avatar Jan 03 '23 07:01 almogsegal

We discussed this in our WG. When you don’t give OOB, we need Allgather in the implementation and this feature is lacking in the implementation. This requires a lot of developer cycles, and we are trying to figure out an easy way to implement this feature. :)

manjugv avatar Jan 13 '23 22:01 manjugv

> The existing OOB initialization may be a blocker for some users to adopt UCC. I suggest to add another initialization alternative that would allow users to query a unique identifier (ucp_address?) and do the communication themselves.

We don't expose ucp/ucx at the interface level as this will impact the portability. The UCC interfaces are agnostic of UCX.

manjugv avatar Jan 13 '23 22:01 manjugv

> We discussed this in our WG. When you don’t give OOB, we need Allgather in the implementation and this feature is lacking in the implementation. This requires a lot of developer cycles, and we are trying to figure out an easy way to implement this feature. :)

I thought it might be possible to leverage the internal_oob or the internal team implementation for that.

> We don't expose ucp/ucx at the interface level as this will impact the portability. The UCC interfaces are agnostic of UCX.

I didn't mean to expose UCP/UCX at the interface level, but to use an opaque structure.

almogsegal avatar Feb 05 '23 09:02 almogsegal

Some more context on why the current interface is not a good fit for Legion/Legate.

The current UCC context creation API requires an out-of-band (OOB) allgather callback. The obvious way to implement this is using MPI_Iallgather[^1]. However, for a number of reasons we would like to avoid introducing a dependency on MPI if we can help it.

In the general case it's not possible to provide an allgather callback (using purely Legion primitives) that is guaranteed to work in every context. Typically in a Legion application a single controller thread is managing all the resources in a node, and spawns a task per GPU/core to take part in communicator initialization. At the point where these tasks are running they are no longer able to pass data to each other (that's what we're trying to initialize a communicator to do in the first place). It might be possible to work around this using lower-level primitives, but that would start to veer off the "blessed" Legion path.

The NCCL model, where a single rank produces a value that the calling code (externally to NCCL) broadcasts to all other ranks, fits much more easily within Legion's task model.
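To make the control flow concrete, here is a minimal toy sketch of that model in Python. It is not NCCL or UCC code: `get_unique_id` and `comm_init_rank` are hypothetical stand-ins for the roles played by `ncclGetUniqueId` and `ncclCommInitRank`, and `caller_broadcast` stands in for whatever transport the application already has (Legion, in our case). The point it illustrates is that the library never calls back into the application; the application moves the opaque bytes itself.

```python
import os
from dataclasses import dataclass
from typing import List

ID_BYTES = 128  # same size as NCCL_UNIQUE_ID_BYTES; the contents are opaque


@dataclass(frozen=True)
class UniqueId:
    """Opaque token: the caller copies the bytes but never interprets them."""
    raw: bytes


@dataclass(frozen=True)
class Comm:
    """Toy communicator handle (hypothetical, for illustration only)."""
    uid: UniqueId
    rank: int
    nranks: int


def get_unique_id() -> UniqueId:
    # Hypothetical stand-in for ncclGetUniqueId: called on ONE rank only.
    return UniqueId(os.urandom(ID_BYTES))


def comm_init_rank(uid: UniqueId, rank: int, nranks: int) -> Comm:
    # Hypothetical stand-in for ncclCommInitRank: every rank calls this
    # with the same id, its own rank, and the total number of ranks.
    if not 0 <= rank < nranks:
        raise ValueError("rank out of range")
    return Comm(uid, rank, nranks)


def caller_broadcast(payload: bytes, nranks: int) -> List[bytes]:
    # The application's own transport, simulated here by copying the
    # payload once per rank. The library is not involved in this step.
    return [bytes(payload) for _ in range(nranks)]


def init_all(nranks: int) -> List[Comm]:
    uid = get_unique_id()                        # step 1: rank 0 creates the id
    copies = caller_broadcast(uid.raw, nranks)   # step 2: caller distributes it
    # step 3: each rank initializes independently from its own copy
    return [comm_init_rank(UniqueId(copies[r]), r, nranks) for r in range(nranks)]
```

In a real Legion program, steps 1 and 2 would run in the controller and step 3 inside the per-GPU/core tasks, so no task ever needs to exchange data with a sibling during initialization.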

At the very least we need to be in control of the out-of-band communication; having to provide a callback is what gets us into trouble. We could possibly build an allgather on top of Legion primitives, but we would need to be in charge of invoking it.

[^1]: Note that this is not the case if multiple threads under the same process are taking part in the communicator, since AFAIU MPI collectives don't support this mode. Instead we have to build up an allgather using point-to-point exchanges.

manopapad avatar Nov 21 '23 00:11 manopapad