Denis
The GPU used is an AMD MI100 and the OS is Rocky Linux 8.8.
I tried with the GPU-aware OSU benchmarks, and it all works fine, so it seems to be related to the AMReX code itself.
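For context, the GPU-aware runs were along these lines (paths and host names here are illustrative; the OSU micro-benchmarks must be built with ROCm support, e.g. `--enable-rocm`):

```
# device-to-device bandwidth between two nodes, buffers allocated on the GPU (D D)
mpirun -np 2 --host node1,node2 ./osu_bw -d rocm D D
```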
Which command(s) should I use to get this info from UCX?
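As a starting point, these are the standard UCX introspection commands (output details vary by version):

```
ucx_info -v   # build version and configure flags (shows whether ROCm support is compiled in)
ucx_info -d   # available transports/devices (rocm_copy, rocm_ipc, rc_verbs, ...)
ucx_info -c   # current configuration values (UCX_* variables)
```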
IOMMU is now enabled in both the BIOS and the GRUB boot manager on two of our GPU nodes. Unfortunately, this new setup did not help. From the very first MPI inter-node communication...
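For reference, the GRUB side of that change is typically something like the following sketch (the kernel parameters shown are the common AMD ones; the exact line used on our nodes may differ):

```
# /etc/default/grub -- append the IOMMU kernel parameters
GRUB_CMDLINE_LINUX="... amd_iommu=on iommu=pt"

# regenerate the config (Rocky Linux 8, BIOS boot; UEFI systems use the EFI path instead)
grub2-mkconfig -o /boot/grub2/grub.cfg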
With debug output:
```
[1705664408.009086] [lxbk1115:177189:0] wireup.c:1635 UCX DEBUG ep 0x7fade005d180: send wireup request (flags=0x40)
[1705664408.009096] [lxbk1115:177189:a] ib_iface.c:797 UCX DEBUG iface 0x33635a0: ah_attr dlid=286 sl=0 port=1 src_path_bits=0
[1705664408.009102] [lxbk1115:177189:a] ud_ep.c:824...
```
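(Logs like the above are typically captured by exporting `UCX_LOG_LEVEL=debug` to all ranks, e.g. with Open MPI's `-x` option; the application name below is a placeholder:)

```
# -x forwards the environment variable to every rank
mpirun -np 2 -x UCX_LOG_LEVEL=debug ./amrex_app 2>&1 | tee ucx_debug.log
```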
@edgargabriel Answer from our sysadmin colleague:
```
The PCI ACS is disabled (see one of my previous emails):
lspci -vv -s 03:00.0 | grep 'Access Control Services' -A 2
(...)
It...
```
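A quick way to confirm that across the whole machine rather than a single BDF is a loop like this (a sketch using only standard pciutils; run as root so `lspci -vv` can read the capability registers):

```
# print the ACS capability block for every PCI function that advertises it
for bdf in $(lspci | awk '{print $1}'); do
  if lspci -vv -s "$bdf" 2>/dev/null | grep -q 'Access Control Services'; then
    echo "=== $bdf ==="
    lspci -vv -s "$bdf" | grep -A 2 'Access Control Services'
  fi
done
```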
We do not use the official Mellanox OFED (MOFED), but the Linux rdma-core library.
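(A quick way to check which stack is present on an RPM-based system like Rocky:)

```
rpm -q rdma-core libibverbs   # the inbox stack shipped by the distro
ofed_info -s 2>/dev/null      # prints a version only if Mellanox OFED is installed
```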
Not simple to reproduce, but you can give it a try:
- ROCm 6.0 in /opt/rocm
- UCX 1.15.0 (with ROCm) + Open MPI 5.0.1 installed in /usr/local
- you can fetch/compile...
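A build along those lines would roughly be the following sketch (configure flags are the standard UCX/Open MPI ones; source directory names assume the release tarballs):

```
# UCX 1.15.0 with ROCm support
cd ucx-1.15.0
./configure --prefix=/usr/local --with-rocm=/opt/rocm
make -j && make install

# Open MPI 5.0.1 built against that UCX
cd ../openmpi-5.0.1
./configure --prefix=/usr/local --with-ucx=/usr/local
make -j && make install
```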
I can understand that! If you need any help, do not hesitate to ask!
Please try again with this new input file: [inputs_3d.txt](https://github.com/openucx/ucx/files/14028499/inputs_3d.txt). I commented out the HDF5 output.