Performance Issues with MPIRun Due to Virtual Network Interfaces

Open GeofferyGeng opened this issue 9 months ago • 3 comments

Background information

What version of Open MPI are you using?

v4.1.7rc1

Describe how Open MPI was installed

installed by MLNX_OFED

Please describe the system on which you are running

  • Operating system/version: Ubuntu 22.04
  • Computer hardware: Intel(R) Xeon(R) Platinum 8480+
  • Network type: Ethernet and Mellanox

Details of the problem

I have a server with a single network card that has been virtualized into more than 200 network interfaces. This causes significant delays when using mpirun: the process hangs for a long time. With UCX debug logging enabled, I found that the delays occur mainly while the bridged network interfaces are being processed.

Is there a solution for this issue? Any recommendations on how to optimize or configure the network interfaces to improve the performance of mpirun? Thank you!

ucx log

[1742871925.548610] [pod-hpc-02:1702645:0]       tcp_iface.c:945  UCX  DEBUG filtered out bridge device virbr0
[1742872077.918760] [pod-hpc-02:1702645:0]       tcp_iface.c:945  UCX  DEBUG filtered out bridge device wlan
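
To put a number on the delay, the gap between the two debug lines above can be computed from their leading timestamps. This is a minimal sketch that assumes the first bracketed field is a Unix time in seconds with microsecond precision; the helper names `timestamps` and `gap_seconds` are made up for this illustration.

```python
import re
from decimal import Decimal

# The two debug lines quoted in the report above, copied verbatim.
LOG_LINES = [
    "[1742871925.548610] [pod-hpc-02:1702645:0]       tcp_iface.c:945  UCX  DEBUG filtered out bridge device virbr0",
    "[1742872077.918760] [pod-hpc-02:1702645:0]       tcp_iface.c:945  UCX  DEBUG filtered out bridge device wlan",
]

# Assumption: each line starts with "[<seconds>.<microseconds>]".
TIMESTAMP = re.compile(r"^\[(\d+\.\d+)\]")


def timestamps(lines):
    """Return the leading bracketed timestamp of each matching line."""
    stamps = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            # Decimal keeps all six fractional digits without float rounding.
            stamps.append(Decimal(match.group(1)))
    return stamps


def gap_seconds(lines):
    """Return the time between the first and last timestamped line."""
    stamps = timestamps(lines)
    return stamps[-1] - stamps[0]


if __name__ == "__main__":
    print(f"gap between first and last line: {gap_seconds(LOG_LINES)} s")
```

Under that assumption, the two lines are roughly 152 seconds apart, which is consistent with the long hang described in the report.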

GeofferyGeng, Mar 25 '25 03:03

UCX_NET_DEVICES is your friend here. Set it to the interface you intend to use, and this issue should go away.
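
As an illustration of that suggestion, here is a minimal sketch of a launcher that sets UCX_NET_DEVICES and asks mpirun to forward it to the ranks with `-x`. The interface name `eth0`, the program path `./my_mpi_app`, and the helper names `build_mpirun` and `launch` are placeholders invented for this example; substitute the device you actually want UCX to use.

```python
import os
import subprocess


def build_mpirun(device, nprocs, program):
    """Build the argv and environment for an mpirun launch restricted to one device."""
    env = dict(os.environ)
    # UCX reads this variable to decide which network devices it may use.
    env["UCX_NET_DEVICES"] = device
    # "-x NAME" asks Open MPI's mpirun to export NAME to the launched processes.
    argv = ["mpirun", "-np", str(nprocs), "-x", "UCX_NET_DEVICES", *program]
    return argv, env


def launch(device, nprocs, program):
    """Run the job; requires Open MPI's mpirun on PATH."""
    argv, env = build_mpirun(device, nprocs, program)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    # "eth0" and "./my_mpi_app" are hypothetical values for illustration only.
    argv, env = build_mpirun("eth0", 4, ["./my_mpi_app"])
    print("UCX_NET_DEVICES=" + env["UCX_NET_DEVICES"], " ".join(argv))
```

The main block only prints the command; call `launch` on a host where Open MPI is installed to actually start the job.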

bosilca, Mar 25 '25 14:03

UCX_NET_DEVICES is your friend here. Set it to the interface you intend to use, and this issue should go away.

Thanks for your reply!

Actually, I have already set UCX_NET_DEVICES to another NIC (TCP), but it seems to take effect only after the two "filtered out bridge device" log lines have been printed. I'm confused.

As a result, my tests took four times as long to complete.

GeofferyGeng, Mar 26 '25 11:03

I think I see the problem: uct_tcp_query_devices scans through all the interfaces, builds a list of the active, non-bridged interfaces, and only then trims it to the devices the user requested. On a system with hundreds of virtual interfaces this is a very costly process, as it involves many syscalls for each interface.
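
To make the cost argument concrete, here is a toy model of the two possible orderings. It is not UCX code: `scan_then_trim` mirrors the behaviour described above, `trim_then_scan` is a hypothetical alternative shown only for comparison, and `CountingProbe` stands in for the per-interface checks (active, not a bridge) by counting how often it is called.

```python
def scan_then_trim(names, probe, requested):
    """Probe every interface first, then keep only the requested ones."""
    usable = [name for name in names if probe(name)]
    if requested is None:
        return usable
    return [name for name in usable if name in requested]


def trim_then_scan(names, probe, requested):
    """Hypothetical alternative: drop unrequested names before probing."""
    if requested is not None:
        names = [name for name in names if name in requested]
    return [name for name in names if probe(name)]


class CountingProbe:
    """Stand-in for the expensive per-interface checks; counts its calls."""

    def __init__(self, unusable):
        self.unusable = set(unusable)
        self.calls = 0

    def __call__(self, name):
        self.calls += 1
        return name not in self.unusable


if __name__ == "__main__":
    # One real NIC, one bridge, and 200 made-up virtual interfaces.
    names = ["eth0", "virbr0"] + [f"veth{i}" for i in range(200)]
    for strategy in (scan_then_trim, trim_then_scan):
        probe = CountingProbe(unusable={"virbr0"})
        selected = strategy(names, probe, {"eth0"})
        print(f"{strategy.__name__}: selected={selected}, probes={probe.calls}")
```

In this model both orderings select the same device, but the first probes all 202 interfaces while the second probes only the requested one. Whether UCX can reorder its checks this way is for the UCX maintainers to decide.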

This is not something we can fix in OMPI, it should be reported and addressed directly in UCX. @janjust @yosefe should be able to help.

bosilca, Mar 26 '25 14:03