Running UCX only with ibverbs RC

Open planetA opened this issue 5 years ago • 13 comments

Hello,

For my small project, I would like to test MPI library support. I compiled OpenMPI with UCX 1.6.1. Due to limitations of my project, I need to make sure that, among all ibverbs primitives, UCX uses only Reliable Connections (RC). Namely, I want to make sure that UCX does not rely on UD or DC connection types, or on RDMA CM (librdmacm).

So far, I've been unsuccessful in making this setup work. I don't know the reason, but to rule out some of the options, I would be glad if you could confirm that UCX is expected to work with RC only.

I'm trying to run the setup inside a Docker container. Here is the Dockerfile for creating the container. As an InfiniBand device, I use SoftRoCE.

planetA avatar Feb 21 '20 11:02 planetA

When you select RC, UCX will still use UD as an auxiliary transport (to wire up the RC connection).

brminich avatar Feb 21 '20 11:02 brminich

@brminich When I compile UCX, I disable quite a lot of stuff:

./configure --with-rc --with-ud=no --with-dc=no \
    --with-mlx5-dv=no --with-dm=no --with-ib-hw-tm=no \
    --with-rdmacm=no --enable-cma=no

The question is, would it break UCX, or make UCX use only RC and TCP?

planetA avatar Feb 21 '20 15:02 planetA

> The question is, would it break UCX, or make UCX use only RC and TCP?

Unfortunately, RC can't use TCP as an auxiliary transport for the wireup procedure, because TCP lacks UCT_IFACE_FLAG_CB_ASYNC support.

dmitrygx avatar Feb 21 '20 15:02 dmitrygx

@dmitrygx Just to be clear, is there any other auxiliary transport that is not ibverbs, but can be used by RC?

planetA avatar Feb 21 '20 15:02 planetA

> @dmitrygx Just to be clear, is there any other auxiliary transport that is not ibverbs, but can be used by RC?

@planetA Intra-node: yes, the MM transports (e.g. posix/sysv/xpmem). Inter-node: no, only the UD transports (e.g. ud_mlx5/ud_verbs).
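
The rule described in this thread (an RC transport needs a UD transport for inter-node wireup) can be sketched as a small pre-flight check on a candidate UCX_TLS value. This is only an illustration under that assumption: the function name `check_rc_wireup` and its output strings are hypothetical and not part of UCX, and the check deliberately ignores the intra-node MM case.

```shell
#!/bin/sh
# Hypothetical helper, not part of UCX: classify a comma-separated
# UCX_TLS list using the rule from this thread, i.e. an RC transport
# needs a UD transport as the auxiliary (wireup) transport inter-node.
check_rc_wireup() {
    # Surround the list with commas so every entry can be matched as ",name,"
    tls=",$1,"

    # Is an RC transport requested at all?
    case "$tls" in
        *,rc_verbs,*|*,rc_mlx5,*) ;;
        *) echo "no-rc"; return 0 ;;
    esac

    # Is a UD transport present to carry the wireup?
    case "$tls" in
        *,ud_verbs,*|*,ud_mlx5,*) echo "ok" ;;
        *) echo "missing-ud" ;;
    esac
}

check_rc_wireup "rc_verbs,ud_verbs"   # prints "ok"
check_rc_wireup "rc_verbs,tcp"        # prints "missing-ud"
```

A list that passes the check could then be passed to the job through the UCX_TLS environment variable, for example with Open MPI's `-x UCX_TLS=...` option.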

dmitrygx avatar Feb 21 '20 16:02 dmitrygx

OK. Thank you for the clarification.

planetA avatar Feb 21 '20 16:02 planetA

> OK. Thank you for the clarification.

@planetA We could consider adding UCT_IFACE_FLAG_CB_ASYNC support to UCT/TCP as a feature request for upcoming releases. That would allow UCT/TCP to serve as an auxiliary transport in UCP, making UCX_TLS=rc_verbs,tcp or UCX_TLS=rc_mlx5,tcp possible (rc_verbs/rc_mlx5 would be used on the data path, and tcp as the auxiliary transport to set up connections).

dmitrygx avatar Feb 21 '20 17:02 dmitrygx

@dmitrygx How do I make this kind of feature request? Open another issue?

planetA avatar Feb 23 '20 17:02 planetA

> @dmitrygx How do I make this kind of feature request? Open another issue?

@planetA I think this issue contains enough info. I've reopened it and marked it with the "Feature" label. Thank you!

dmitrygx avatar Feb 24 '20 12:02 dmitrygx

Hi. I would like to ask kindly, if there is any progress with this feature?

planetA avatar Jul 04 '20 06:07 planetA

> Hi. I would like to ask kindly, if there is any progress with this feature?

@planetA Unfortunately, there is no progress on this. We will update you when we start implementing it.

dmitrygx avatar Oct 21 '20 17:10 dmitrygx

> Hi. I would like to ask kindly, if there is any progress with this feature?
>
> @planetA Unfortunately, there is no progress on this. We will update you when we start implementing it.

May I kindly ask if there is any update on this?

howardlau1999 avatar Mar 12 '25 15:03 howardlau1999

https://github.com/openucx/ucx/pull/9061#issuecomment-1691692832

changchengx avatar Mar 13 '25 01:03 changchengx