ucp_mem_map registers only a few GB per second

Open thomas3494 opened this issue 7 months ago • 0 comments

The issue

Registering memory for remote access with ucp_mem_map reaches only about 6-13 GB/s, depending on the size of the region.

size in GB, bandwidth in GB/s
5.000000, 8.409936
10.000000, 13.216588
20.000000, 13.187367
40.000000, 7.561953
80.000000, 6.037441

Steps to Reproduce

Modify mem_map_bench.sh in https://github.com/thomas3494/BSPonUCX to match your SLURM setup, and run sbatch mem_map_bench.sh. This will print two tables (one for each node) with bandwidth results in GB/s for various sizes (5 - 80 GB by default) in the file mem_map_bench.out.

UCX version and configuration:

Library version: 1.16.0
Library path: /sw/arch/RHEL9/EB_production/2024/software/UCX/1.16.0-GCCcore-13.3.0/lib/libucs.so.0
API headers version: 1.16.0
Git branch '', revision e4bb802
Configured with: --prefix=/sw/arch/RHEL9/EB_production/2024/software/UCX/1.16.0-GCCcore-13.3.0 --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --enable-optimizations --enable-cma --enable-mt --with-verbs --without-java --without-go --disable-doxygen-doc --disable-logging --disable-debug --disable-assertions --disable-params-check

The environment is in ucx_environment.txt.

Setup and versions

OS: Red Hat 9.4
Kernel: 5.14.0-427.42.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 18 14:35:40 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux

IB information is in ib_devinfo.txt.

Additional information

  • OpenMPI version: 5.0.3 (only used for OOB connection)
  • Transport and devices in ucx_info.txt.
  • Log file - log.txt.

thomas3494, Jun 06 '25 09:06