Alex Margolin
I've encountered this problem too. At first glance, this looks like the offending commit: https://github.com/openucx/ucx/commit/ed2011b70c43a480bae4b1f2cc1fa6851ce30235 (@evgeny-leksikov for attention). Specifically, I observed it during MPI_Finalize, when UCX seems...
@yosefe I'm seeing this with CX-6, and I'm not sure it's a HW-related issue. I see this on osu_bw; I'll run ucx_perftest too. Right now it looks like this UD...
Confirmed also in `ucx_perftest` CX:
```
UCX_NET_DEVICES=mlx5_0:1 UCX_LOG_LEVEL=req /shared/alexm/applications/ucx_perftest/ucx_perftest -t ucp_am_bw
[1713684825.142475] [srv25-r203:480403:0] debug.c:1152 UCX DEBUG using signal stack 0x7fbdeb05d000 size 148160
[1713684825.157998] [srv25-r203:480403:0] init.c:121 UCX DEBUG /home/alexm/environments/rhel-8.9/home/lib/libucs.so.0 loaded at...
```