SEGV in ompi_request_default_test_all() when triggering IPoIB networking problem during ucc_perftest run
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
v4.1.7rc1
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Git clone.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
- Operating system/version: Ubuntu 24.04.2 LTS (Noble Numbat), Kernel 6.8.0-57-generic
- Computer hardware:
- Network type: IPoIB
Details of the problem
With the following ucc_perftest reproducer:
$ srun -A admin -p admin -N64 --mpi=pmix --ntasks-per-node=8 --container-image=<container-image> env UCX_TLS=self,tcp ucc_perftest -c alltoall -m host -b 1048576 -e 2147483648 -n 2
srun: job 357102 queued and waiting for resources
srun: job 357102 has been allocated resources
[1744257072.724104] [node0190:1259783:0] sock.c:334 UCX ERROR connect(fd=252, dest_addr=<ip address>) failed: Connection timed out
[node0190.<domain>:1259783] pml_ucx.c:424 Error: ucp_ep_create(proc=496) failed: Destination is unreachable
[node0190.<domain>:1259783] pml_ucx.c:477 Error: Failed to resolve UCX endpoint for rank 496 [LOG_CAT_COMMPATTERNS] isend failed in comm_allreduce_pml at iterations 7
.................
core files are created on several nodes. I have dug into these core files to understand the cause of the SEGVs, which is likely a wrong/missing error path in the OMPI/UCC code that is exercised when the IPoIB networking issue triggers the errors/messages shown above.
Here are my findings. The fully unwound stack is as follows:
#0 ompi_request_default_test_all (count=2, requests=0x555555a2f228, completed=0x7fffffffc5c4, statuses=0x0) at request/req_test.c:187
#1 0x00007ffff50139ac in oob_allgather_test (req=0x555555a2f200) at coll_ucc_module.c:182
#2 0x00007ffff7f8ea5c in ucc_core_addr_exchange (context=context@entry=0x555555a2e990, oob=oob@entry=0x555555a2e9a8, addr_storage=addr_storage@entry=0x555555a2eaa0) at core/ucc_context.c:461
#3 0x00007ffff7f8f657 in ucc_context_create_proc_info (lib=0x5555559d12b0, params=params@entry=0x7fffffffc960, config=0x555555a2e840, context=context@entry=0x7ffff50213c8 <mca_coll_ucc_component+392>, proc_info=0x7ffff7fbca60 <ucc_local_proc>)
at core/ucc_context.c:723
#4 0x00007ffff7f901f0 in ucc_context_create (lib=<optimized out>, params=params@entry=0x7fffffffc960, config=<optimized out>, context=context@entry=0x7ffff50213c8 <mca_coll_ucc_component+392>) at core/ucc_context.c:866
#5 0x00007ffff5013cb1 in mca_coll_ucc_init_ctx () at coll_ucc_module.c:302
#6 0x00007ffff501583f in mca_coll_ucc_comm_query (comm=0x55555557d240 <ompi_mpi_comm_world>, priority=0x7fffffffcb6c) at coll_ucc_module.c:488
#7 0x00007ffff7ee5e4c in query_2_0_0 (module=<synthetic pointer>, priority=0x7fffffffcb6c, comm=0x55555557d240 <ompi_mpi_comm_world>, component=0x7ffff5021240 <mca_coll_ucc_component>) at base/coll_base_comm_select.c:540
#8 query (module=<synthetic pointer>, priority=0x7fffffffcb6c, comm=<optimized out>, component=0x7ffff5021240 <mca_coll_ucc_component>) at base/coll_base_comm_select.c:523
#9 check_one_component (module=<synthetic pointer>, component=0x7ffff5021240 <mca_coll_ucc_component>, comm=<optimized out>) at base/coll_base_comm_select.c:486
#10 check_components (comm=comm@entry=0x55555557d240 <ompi_mpi_comm_world>, components=<optimized out>) at base/coll_base_comm_select.c:406
#11 0x00007ffff7ee6446 in mca_coll_base_comm_select (comm=0x55555557d240 <ompi_mpi_comm_world>) at base/coll_base_comm_select.c:114
#12 0x00007ffff7f33613 in ompi_mpi_init (argc=<optimized out>, argc@entry=0, argv=<optimized out>, argv@entry=0x0, requested=0, provided=0x7fffffffcdf4, reinit_ok=reinit_ok@entry=false) at runtime/ompi_mpi_init.c:957
#13 0x00007ffff7ed6c2c in PMPI_Init (argc=0x0, argv=0x0) at pinit.c:69
#14 0x000055555555dbf4 in ucc_pt_bootstrap_mpi::ucc_pt_bootstrap_mpi() ()
#15 0x0000555555565666 in ucc_pt_comm::ucc_pt_comm(ucc_pt_comm_config) ()
#16 0x0000555555558f2a in main ()
where you can see that the unresolved symbol/frame in the stack detailed previously is in fact in oob_allgather_test().
The reason for the SEGV is the following:
(gdb) p/x *(oob_allgather_req_t *)0x555555a2f200
$1 = {sbuf = 0x555555a2ea00, rbuf = 0x555555a710c0, oob_coll_ctx = 0x55555557d240, msglen = 0x8, iter = 0x1, reqs = {0x726568, 0x555555a8fa48}}
where reqs[0] contains a garbage pointer that is then dereferenced:
(gdb) p/x $rip
$3 = 0x7ffff7eb39e8
(gdb) x/10i ($rip - 0x18)
0x7ffff7eb39d0 <ompi_request_default_test_all+48>: cmpq $0x1,0x58(%rax)
0x7ffff7eb39d5 <ompi_request_default_test_all+53>: je 0x7ffff7eb39f0 <ompi_request_default_test_all+80>
0x7ffff7eb39d7 <ompi_request_default_test_all+55>: lea 0x1(%r12),%rax
0x7ffff7eb39dc <ompi_request_default_test_all+60>: cmp %rax,%rdi
0x7ffff7eb39df <ompi_request_default_test_all+63>: je 0x7ffff7eb39fe <ompi_request_default_test_all+94>
0x7ffff7eb39e1 <ompi_request_default_test_all+65>: mov %rax,%r12
0x7ffff7eb39e4 <ompi_request_default_test_all+68>: mov (%rbx,%r12,8),%rax
=> 0x7ffff7eb39e8 <ompi_request_default_test_all+72>: mov 0x60(%rax),%esi
0x7ffff7eb39eb <ompi_request_default_test_all+75>: cmp $0x1,%esi
0x7ffff7eb39ee <ompi_request_default_test_all+78>: jne 0x7ffff7eb39d0 <ompi_request_default_test_all+48>
(gdb) x/gx ($rax + 0x60)
0x7265c8: Cannot access memory at address 0x7265c8
(gdb) p/x $rbx + $r12 * 0x8
$4 = 0x555555a2f228
(gdb) x/gx ($rbx + $r12 * 0x8)
0x555555a2f228: 0x0000000000726568
(gdb) p/x $rax
$5 = 0x726568
(gdb) x/gx ($rax + 0x60)
0x7265c8: Cannot access memory at address 0x7265c8
(gdb)
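To tie the registers back to the source: the indexed load computes $rbx + $r12*8 = 0x555555a2f228, which is exactly the requests argument seen in frame #0, i.e. &oob_req->reqs[0] (0x555555a2f200 plus 0x28, the offset of the reqs[] member after the four 8-byte fields sbuf/rbuf/oob_coll_ctx/msglen and the padded int iter, assuming the usual x86_64 layout: 4*8 + 8 = 0x28). The value loaded from there, 0x726568, matches the garbage reqs[0] shown in the struct dump above, and it is the subsequent read at that pointer + 0x60 that faults.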
Looking at the corresponding source code in "ompi/mca/coll/ucc/coll_ucc_module.c" :
141
142 typedef struct oob_allgather_req{
143 void *sbuf;
144 void *rbuf;
145 void *oob_coll_ctx;
146 size_t msglen;
147 int iter;
148 ompi_request_t *reqs[2];
149 } oob_allgather_req_t;
150
151 static ucc_status_t oob_allgather_test(void *req)
152 {
153 oob_allgather_req_t *oob_req = (oob_allgather_req_t*)req;
154 ompi_communicator_t *comm = (ompi_communicator_t *)oob_req->oob_coll_ctx;
155 char *tmpsend = NULL;
156 char *tmprecv = NULL;
157 size_t msglen = oob_req->msglen;
158 int probe_count = 5;
159 int rank, size, sendto, recvfrom, recvdatafrom,
160 senddatafrom, completed, probe;
161
162 size = ompi_comm_size(comm);
163 rank = ompi_comm_rank(comm);
164 if (oob_req->iter == 0) {
165 tmprecv = (char*) oob_req->rbuf + (ptrdiff_t)rank * (ptrdiff_t)msglen;
166 memcpy(tmprecv, oob_req->sbuf, msglen);
167 }
168 sendto = (rank + 1) % size;
169 recvfrom = (rank - 1 + size) % size;
170 for (; oob_req->iter < size - 1; oob_req->iter++) {
171 if (oob_req->iter > 0) { <<<< iter is 0 for 1st loop ...
172 probe = 0;
173 do {
174 ompi_request_test_all(2, oob_req->reqs, &completed, MPI_STATUS_IGNORE);
<<<<<< during the 2nd loop (iter == 1), ompi_request_test_all() is called with the garbled reqs[0] !!
175 probe++;
176 } while (!completed && probe < probe_count);
177 if (!completed) {
178 return UCC_INPROGRESS;
179 }
180 }
181 recvdatafrom = (rank - oob_req->iter - 1 + size) % size;
182 senddatafrom = (rank - oob_req->iter + size) % size;
183 tmprecv = (char*)oob_req->rbuf + (ptrdiff_t)recvdatafrom * (ptrdiff_t)msglen;
184 tmpsend = (char*)oob_req->rbuf + (ptrdiff_t)senddatafrom * (ptrdiff_t)msglen;
185 MCA_PML_CALL(isend(tmpsend, msglen, MPI_BYTE, sendto, MCA_COLL_BASE_TAG_UCC,
186 MCA_PML_BASE_SEND_STANDARD, comm, &oob_req->reqs[0]));
<<<<<< isend triggers an error so reqs[0] is not populated !!
187 MCA_PML_CALL(irecv(tmprecv, msglen, MPI_BYTE, recvfrom,
188 MCA_COLL_BASE_TAG_UCC, comm, &oob_req->reqs[1]));
<<<<<< irecv does not report an error, so reqs[1] is populated.
189 }
190 probe = 0;
191 do {
192 ompi_request_test_all(2, oob_req->reqs, &completed, MPI_STATUS_IGNORE);
193 probe++;
194 } while (!completed && probe < probe_count);
195 if (!completed) {
196 return UCC_INPROGRESS;
197 }
198 return UCC_OK;
199 }
200
201 static ucc_status_t oob_allgather_free(void *req)
202 {
203 free(req);
204 return UCC_OK;
205 }
206
207 static ucc_status_t oob_allgather(void *sbuf, void *rbuf, size_t msglen,
208 void *oob_coll_ctx, void **req)
209 {
210 oob_allgather_req_t *oob_req = malloc(sizeof(*oob_req));
211 oob_req->sbuf = sbuf;
212 oob_req->rbuf = rbuf;
213 oob_req->msglen = msglen;
214 oob_req->oob_coll_ctx = oob_coll_ctx;
215 oob_req->iter = 0;
216 *req = oob_req;
217 return UCC_OK;
218 }
219
"ompi/mca/coll/ucc/coll_ucc_module.c" 528 lines --41%-- 219,0-1 37%
and just to be complete, from "ompi/request/request.h":
#define ompi_request_test_all (ompi_request_functions.req_test_all)
"ompi/request/request.h" 504L, 19446B 407,1 83%
(gdb) x/i ompi_request_functions.req_test_all
0x7ffff7eb39a0 <ompi_request_default_test_all>: endbr64
Based on all of this, it appears that the following patch (against v4.1.7rc1, the fairly recent OMPI version we are running) would allow OMPI/UCC to avoid the core dump by gracefully handling any error during isend/irecv:
~/ompi$ git status
HEAD detached at v4.1.7rc1
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: ompi/mca/coll/ucc/coll_ucc_module.c
no changes added to commit (use "git add" and/or "git commit -a")
~/ompi$ git diff
diff --git a/ompi/mca/coll/ucc/coll_ucc_module.c b/ompi/mca/coll/ucc/coll_ucc_module.c
index 1686697618..dfa2674a3d 100644
--- a/ompi/mca/coll/ucc/coll_ucc_module.c
+++ b/ompi/mca/coll/ucc/coll_ucc_module.c
@@ -158,6 +158,7 @@ static ucc_status_t oob_allgather_test(void *req)
int probe_count = 5;
int rank, size, sendto, recvfrom, recvdatafrom,
senddatafrom, completed, probe;
+ int rc;
size = ompi_comm_size(comm);
rank = ompi_comm_rank(comm);
@@ -182,10 +183,12 @@ static ucc_status_t oob_allgather_test(void *req)
senddatafrom = (rank - oob_req->iter + size) % size;
tmprecv = (char*)oob_req->rbuf + (ptrdiff_t)recvdatafrom * (ptrdiff_t)msglen;
tmpsend = (char*)oob_req->rbuf + (ptrdiff_t)senddatafrom * (ptrdiff_t)msglen;
- MCA_PML_CALL(isend(tmpsend, msglen, MPI_BYTE, sendto, MCA_COLL_BASE_TAG_UCC,
+ rc = MCA_PML_CALL(isend(tmpsend, msglen, MPI_BYTE, sendto, MCA_COLL_BASE_TAG_UCC,
MCA_PML_BASE_SEND_STANDARD, comm, &oob_req->reqs[0]));
- MCA_PML_CALL(irecv(tmprecv, msglen, MPI_BYTE, recvfrom,
+ if (OMPI_SUCCESS != rc) return rc;
+ rc = MCA_PML_CALL(irecv(tmprecv, msglen, MPI_BYTE, recvfrom,
MCA_COLL_BASE_TAG_UCC, comm, &oob_req->reqs[1]));
+ if (OMPI_SUCCESS != rc) return rc;
}
probe = 0;
do {
@@ -213,6 +216,8 @@ static ucc_status_t oob_allgather(void *sbuf, void *rbuf, size_t msglen,
oob_req->msglen = msglen;
oob_req->oob_coll_ctx = oob_coll_ctx;
oob_req->iter = 0;
+ oob_req->reqs[0] = NULL;
+ oob_req->reqs[1] = NULL;
*req = oob_req;
return UCC_OK;
}
~/ompi$
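Note that oob_allgather_test() returns a ucc_status_t while the PML calls return OMPI error codes, so the `return rc` above hands an OMPI error code back to UCC. A minimal sketch of an alternative error path that instead maps a PML failure onto a UCC status (using UCC_ERR_NO_MESSAGE only as a generic failure code; the exact status to return is an assumption on my side):

    int rc;

    rc = MCA_PML_CALL(isend(tmpsend, msglen, MPI_BYTE, sendto, MCA_COLL_BASE_TAG_UCC,
                            MCA_PML_BASE_SEND_STANDARD, comm, &oob_req->reqs[0]));
    if (OMPI_SUCCESS != rc) {
        /* isend failed: reqs[0] was never populated, bail out before
           ompi_request_test_all() can dereference it */
        return UCC_ERR_NO_MESSAGE;
    }
    rc = MCA_PML_CALL(irecv(tmprecv, msglen, MPI_BYTE, recvfrom,
                            MCA_COLL_BASE_TAG_UCC, comm, &oob_req->reqs[1]));
    if (OMPI_SUCCESS != rc) {
        /* same handling for the receive side */
        return UCC_ERR_NO_MESSAGE;
    }

Either way, initializing oob_req->reqs[] in oob_allgather() (as in the last hunk above) avoids leaving uninitialized pointers around if an error path is ever missed.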
@bfaccini was OMPI built with UCC? If so, I think you need to disable UCC via --mca coll ^ucc to run the ucc perf tests.
https://github.com/open-mpi/ompi/pull/13194
was ompi built with ucc?
It looks like it was:
$ ompi_info
Package: Open MPI root@sharp-ci-01 Distribution
Open MPI: 4.1.7rc1
Open MPI repo revision: v4.1.5-175-ga2335dd1c5
Open MPI release date: Unreleased developer copy
Open RTE: 4.1.7rc1
Open RTE repo revision: v4.1.5-175-ga2335dd1c5
Open RTE release date: Unreleased developer copy
OPAL: 4.1.7rc1
OPAL repo revision: v4.1.5-175-ga2335dd1c5
OPAL release date: Unreleased developer copy
MPI API: 3.1.0
Ident string: 4.1.7rc1
Prefix: /opt/hpcx/ompi
Configured architecture: x86_64-pc-linux-gnu
Configure host: sharp-ci-01
Configured by: root
Configured on: Wed Dec 25 17:24:17 UTC 2024
Configure host: sharp-ci-01
Configure command line: '--prefix=/build-result/hpcx-v2.22-gcc-inbox-ubuntu24.04-cuda12-x86_64/ompi' '--with-libevent=internal' '--enable-mpi1-compatibility' '--without-xpmem' '--with-cuda=/hpc/local/oss/cuda12.6.1/ubuntu24.04' '--with-slurm' '--with-platform=contrib/platform/mellanox/optimized' '--with-hcoll=/build-result/hpcx-v2.22-gcc-inbox-ubuntu24.04-cuda12-x86_64/hcoll' '--with-ucx=/build-result/hpcx-v2.22-gcc-inbox-ubuntu24.04-cuda12-x86_64/ucx' '--with-ucc=/build-result/hpcx-v2.22-gcc-inbox-ubuntu24.04-cuda12-x86_64/ucc'
Built by:
Built on: Wed Dec 25 17:37:46 UTC 2024
Built host: sharp-ci-01
C bindings: yes
C++ bindings: no
Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: yes
Fort mpi_f08 compliance: The mpi_f08 module is available, but due to limitations in the gfortran compiler and/or Open MPI, does not support the following: array subsections, direct passthru (where possible) to underlying Open MPI's C functionality
Fort mpi_f08 subarrays: no
Java bindings: no
Wrapper compiler rpath: runpath
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C compiler family name: GNU
C compiler version: 13.2.0
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fort compiler: gfortran
Fort compiler abs: /usr/bin/gfortran
Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
Fort 08 assumed shape: yes
Fort optional args: yes
Fort INTERFACE: yes
Fort ISO_FORTRAN_ENV: yes
Fort STORAGE_SIZE: yes
Fort BIND(C) (all): yes
Fort ISO_C_BINDING: yes
Fort SUBROUTINE BIND(C): yes
Fort TYPE,BIND(C): yes
Fort T,BIND(C,name="a"): yes
Fort PRIVATE: yes
Fort PROTECTED: yes
Fort ABSTRACT: yes
Fort ASYNCHRONOUS: yes
Fort PROCEDURE: yes
Fort USE...ONLY: yes
Fort C_FUNLOC: yes
Fort f08 using wrappers: yes
Fort MPI_SIZEOF: yes
C profiling: yes
C++ profiling: no
Fort mpif.h profiling: yes
Fort use mpi profiling: yes
Fort use mpi_f08 prof: yes
C++ exceptions: no
Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
Sparse Groups: no
Internal debug support: no
MPI interface warnings: yes
MPI parameter check: never
Memory profiling support: no
Memory debugging support: no
dl support: yes
Heterogeneous support: no
mpirun default --prefix: yes
MPI_WTIME support: native
Symbol vis. support: yes
Host topology support: yes
IPv6 support: no
MPI1 compatibility: yes
MPI extensions: affinity, cuda, pcollreq
FT Checkpoint support: no (checkpoint thread: no)
C/R Enabled Debugging: no
MPI_MAX_PROCESSOR_NAME: 256
MPI_MAX_ERROR_STRING: 256
MPI_MAX_OBJECT_NAME: 64
MPI_MAX_INFO_KEY: 36
MPI_MAX_INFO_VAL: 256
MPI_MAX_PORT_NAME: 1024
MPI_MAX_DATAREP_STRING: 128
MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA btl: self (MCA v2.1.0, API v3.1.0, Component v4.1.7)
MCA btl: smcuda (MCA v2.1.0, API v3.1.0, Component v4.1.7)
MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.1.7)
MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.1.7)
MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA crs: none (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA hwloc: hwloc201 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rcache: gpusm (MCA v2.1.0, API v3.3.0, Component v4.1.7)
MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v4.1.7)
MCA rcache: rgpusm (MCA v2.1.0, API v3.3.0, Component v4.1.7)
MCA reachable: netlink (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: env (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA odls: default (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA odls: pspawn (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA regx: fwd (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA regx: naive (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA regx: reverse (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA routed: direct (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA routed: radix (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA schizo: jsm (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA state: app (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA state: novm (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA state: orted (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA state: tool (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: adapt (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: cuda (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: han (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: hcoll (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: monitoring (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: self (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA coll: ucc (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA io: romio321 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA op: avx (MCA v2.1.0, API v1.0.0, Component v4.1.7)
MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA osc: ucx (MCA v2.1.0, API v3.0.0, Component v4.1.7)
MCA pml: v (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v4.1.7)
MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v4.1.7)
MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v4.1.7)
MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v4.1.7)
I think you need to disable ucc via --mca coll ^ucc to run ucc perf tests
I don't get it, do you mean I need to use "--mca coll ^ucc" at build time or at run time?
And anyway, don't you think the analysis and fix proposal are correct?
I'm not sure the patch is enough. UCC uses UCX to move data, and UCX is also used by OMPI as the default PML. Since the collective framework is initialized after the PML, I wonder how the connection establishment succeeded during the PML setup but then failed during the UCC setup?
If you disable UCC, does this test succeed, and is the connection between your processes correctly established?
Actually, never mind - I confirmed with my colleagues that the presence of the UCC coll component in OMPI won't affect the functionality of the test itself.