Open UCX 1.15.0 with Open MPI 4.1.1 - osu_iallgather/osu_iallgatherv hangs when the message size reaches 65536

Open Tobez123 opened this issue 1 year ago • 6 comments

Describe the bug

We use Open UCX 1.15.0 with Open MPI 4.1.1 to run osu_iallgather/osu_iallgatherv. However, when the message size reaches 65536, the program hangs: we waited at least 30 minutes and nothing more was printed.

Things we have tried

  • add '-x UCX_RC_MLX5_RX_QUEUE_LEN=8191': it works!
  • add '-x UCX_RNDV_THRESH=8192': it also works!
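A minimal sketch of how the two workarounds above could be applied, assuming each setting is simply prepended to the mpirun command from "Steps to Reproduce". `build_cmd` is a hypothetical helper that only prints the command line, and `hostfile_path` is the placeholder from this report; the exact commands behind the attached results may differ.

```shell
#!/bin/sh
# Sketch: print the reproduce command with optional extra "-x VAR=value" settings.
# build_cmd is a hypothetical helper; it prints the command and does not run it.
build_cmd() {
    bench="$1"; shift
    extra=""
    for kv in "$@"; do
        extra="${extra} -x ${kv}"
    done
    echo "mpirun${extra} -x UCX_TLS=sm,rc_x -x UCX_NET_DEVICES=mlx5_1:1" \
         "-np 1024 -N 128 --hostfile hostfile_path" \
         "-mca pml ucx -mca btl ^vader,tcp,openib,uct ${bench} -i 2"
}

# Workaround 1: set the rc_mlx5 receive queue length.
build_cmd osu_iallgather UCX_RC_MLX5_RX_QUEUE_LEN=8191
# Workaround 2: set the rendezvous threshold.
build_cmd osu_iallgatherv UCX_RNDV_THRESH=8192
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be pasted into a job script once it looks right.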

Steps to Reproduce

  • Command line: mpirun -x UCX_TLS=sm,rc_x -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2

  • UCX version used: 1.15.0

  • UCX configure flags (can be checked by ucx_info -v)

Library version: 1.15.0
Library path: /lib/libucs.so.0
API headers version: 1.15.0
Git branch '', revision
Configured with: --disable-logging --disable-debug --disable-assertions --disable-params-check --enable-optimizations --prefix=/openucx --enable-mt

  • Any UCX environment variables used
    • UCX_TLS=sm,rc_x
    • UCX_NET_DEVICES=mlx5_1:1

Setup and versions

  • OS version (e.g Linux distro)
    • Linux 6426-node125 4.19.90-2112.8.0.0131.oe1.aarch64 #1 SMP Fri Dec 31 19:53:20 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
  • CPU architecture (x86_64/aarch64/ppc64le/...)
    • aarch64
  • For RDMA/IB/RoCE related issues:
    • Driver version:
      • rdma-core-54mlnx1-1.54303.aarch64
      • MLNX_OFED_LINUX-5.4-3.0.3.0
    • HW information from ibstat or ibv_devinfo -vv command

CA 'mlx5_1'
    CA type: MT4121
    Number of ports: 1
    Firmware version: 16.31.2006
    Hardware version: 0
    Node GUID: 0x98039b030071f6e9
    System image GUID: 0x98039b030071f6e8
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x00010000
        Port GUID: 0x9a039bfffe71f6e9
        Link layer: Ethernet

Additional information (depending on the issue)

  • OpenMPI version
    • Open MPI 4.1.1
  • OSU version
    • osu-micro-benchmarks-7.1-1
  • Output log: iallgather, iallgatherv

Tobez123, Mar 07 '24 08:03

osu_iallgatherv with -x UCX_RC_MLX5_RX_QUEUE_LEN=8191 added: [screenshot: add UCX_RC_MLX5_RX_QUEUE_LEN]

Tobez123, Mar 07 '24 09:03

osu_iallgatherv with -x UCX_RNDV_THRESH=8192 added: [screenshot: add UCX_RNDV_THRESH]

Tobez123, Mar 07 '24 09:03

osu_iallgather with -x UCX_RC_MLX5_RX_QUEUE_LEN=8191 added: [screenshot: iallgather add UCX_RC_MLX5_RX_QUEUE_LEN]

Tobez123, Mar 07 '24 12:03

osu_iallgather with -x UCX_RNDV_THRESH=8192 added: [screenshot: iallgather add UCX_RNDV_THRESH]

Tobez123, Mar 07 '24 12:03

Hi,

I noticed that when you set UCX_RNDV_THRESH=8192, you didn't set UCX_TLS=sm,rc_x. My guess is that the run with UCX_RNDV_THRESH=8192 succeeded because UCX selected a different transport.

Does the program hang if the command line contains UCX_TLS=sm,rc_x along with UCX_RNDV_THRESH=8192?

mpirun -x UCX_RNDV_THRESH=8192 -x UCX_TLS=sm,rc_x -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2

Does the program hang if the command line doesn't contain UCX_TLS=sm,rc_x?

mpirun -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2

Does the program hang if the command line contains UCX_TLS=sm,rc_x,dc?

mpirun -x UCX_TLS=sm,rc_x,dc -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2
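The three variants above could be checked without watching each run by hand. The following is a rough sketch, assuming GNU coreutils `timeout` is available on the launch node; `run_case`, `LIMIT`, and `LOGDIR` are illustrative names, not part of UCX or Open MPI. `timeout` exits with status 124 when the time limit expires, which is treated here as a hang.

```shell
#!/bin/sh
# Sketch: classify a run as OK, HANG, or FAILED using GNU coreutils `timeout`.
# LIMIT, LOGDIR, and run_case are assumptions made for illustration.
LIMIT="${LIMIT:-1800}"    # seconds; 1800 mirrors the 30-minute wait in the report
LOGDIR="${LOGDIR:-.}"     # where per-case output is stored

run_case() {
    label="$1"; shift
    timeout "$LIMIT" "$@" >"${LOGDIR}/${label}.log" 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "${label}: HANG (no exit within ${LIMIT}s)"
    elif [ "$status" -eq 0 ]; then
        echo "${label}: OK"
    else
        echo "${label}: FAILED (exit ${status})"
    fi
}

# Usage with the commands from this comment (left commented out on purpose):
# run_case rndv_rc_x mpirun -x UCX_RNDV_THRESH=8192 -x UCX_TLS=sm,rc_x -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2
# run_case default_tls mpirun -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2
# run_case rc_x_dc mpirun -x UCX_TLS=sm,rc_x,dc -x UCX_NET_DEVICES=mlx5_1:1 -np 1024 -N 128 --hostfile hostfile_path -mca pml ucx -mca btl ^vader,tcp,openib,uct osu_iallgather -i 2
```

A hang is only inferred from the timeout, so a run that is merely slow would be misreported; the per-case log shows the last message size printed before the limit was hit.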

rakhmets, Mar 08 '24 13:03

Thanks for your reply! The following screenshots show the results of what I tried.

  1. With UCX_TLS=sm,rc_x along with UCX_RNDV_THRESH=8192: [screenshot: iallgather+UCX_RNDV_THRESH]
  2. Without UCX_TLS=sm,rc_x: [screenshot: iallgather]
  3. With UCX_TLS=sm,rc_x,dc: [screenshot: iallgather+UCX_TLS sm_rc_x_dc]

Tobez123, Mar 13 '24 11:03