
UCT/IB/EFA: Add EFA MD as an IB module with UD support

Open • tvegas1 opened this issue 7 months ago • 1 comment

What

Add EFA device memory domain and UD support.

(relates to #6353)

Why ?

EFA devices have device-specific capabilities that require a dedicated memory domain.

How ?

This PR:

  • Adds the IB EFA module
  • Adds the IB EFA device query and MD setup
  • Adds UD/UD verbs-specific code to handle out-of-order (ooo) delivery and the missing notification capability

Tested

RPM

$ contrib/buildrpm.sh -t
$ contrib/buildrpm.sh --binrpm
$ rpm -qp --requires rpm-dist/x86_64/ucx-ib-efa-1.18.0-1.el7.x86_64.rpm  | grep -E '(efa|verbs)'
libefa.so.1()(64bit)
libefa.so.1(EFA_1.1)(64bit)
libibverbs.so.1()(64bit)
libibverbs.so.1(IBVERBS_1.1)(64bit)
$ rpm -qp --provides rpm-dist/x86_64/ucx-ib-efa-1.18.0-1.el7.x86_64.rpm
libuct_ib_efa.so.0()(64bit)
ucx-ib-efa = 1.18.0-1.el7
ucx-ib-efa(x86-64) = 1.18.0-1.el7
$ rpm -qp --requires rpm-dist/x86_64/ucx-ib-1.18.0-1.el7.x86_64.rpm  | grep efa
<none>
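
The last check above (the base ucx-ib package must not depend on libefa) could be scripted roughly as follows. This is only a sketch: `no_efa_dep` is a hypothetical helper name, not part of the UCX tree, and the sample input is copied from the listing above rather than taken from a live `rpm` query.

```shell
# Hypothetical helper: reads `rpm -qp --requires` output on stdin and
# succeeds only if no line mentions "efa".
no_efa_dep() {
    ! grep -q 'efa'
}

# Sample input (not a live rpm query): verbs-only requirements.
if printf '%s\n' 'libibverbs.so.1()(64bit)' 'libibverbs.so.1(IBVERBS_1.1)(64bit)' | no_efa_dep; then
    echo "no EFA dependency"
fi
```

With a real package, the same helper would be fed from `rpm -qp --requires rpm-dist/x86_64/ucx-ib-1.18.0-1.el7.x86_64.rpm | no_efa_dep`.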

Traffic

  • perftest traffic with various sizes:
    • ucx_perftest -t tag_lat peer_host -s $((1024)) -n 100000
  • gtest with --gtest_filter=*ud* on AWS (final rerun needed)
  • for now, existing GDR support is reported as detected in the logs
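
A sweep over message sizes, as in the first bullet, might look like the sketch below. `perftest_cmds` is a hypothetical helper and the sizes other than 1024 are assumptions, since the PR only shows the 1024-byte run. The helper prints the commands instead of running them, so it can be reviewed or piped to `sh` on a host that has a `ucx_perftest` server running on the peer.

```shell
# Hypothetical helper: prints one ucx_perftest tag_lat command per size.
# $1 is the peer host, remaining arguments are message sizes in bytes.
perftest_cmds() {
    peer="$1"
    shift
    for size in "$@"; do
        printf 'ucx_perftest -t tag_lat %s -s %s -n 100000\n' "$peer" "$size"
    done
}

# Assumed size list; only 1024 appears in the PR description.
perftest_cmds peer_host 8 1024 65536
```

To actually run the sweep, pipe the output to a shell: `perftest_cmds peer_host 8 1024 65536 | sh`.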

tvegas1 • Jul 17 '24 18:07