Yanfei Guo
Tests are clean. Rebase to main and remove the irrelevant commit.
test: mpich/ch4/ofi
Action items:
- [x] 1. Skip the AVX512F runtime test if configured explicitly with `--enable-fast=avx512f`; do a compile-time check to make sure the compiler can generate the code.
- [x] 2. Default behavior...
Just to follow up on this. The currently supported options are `--enable-fast=O3,alwaysinline,avx,avx512f,ndebug`. `avx` enables AVX2 and everything below it. `avx512f` enables AVX512F and everything below it. The configure now...
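The split between a compile-time check and a runtime check described above can be sketched roughly as follows. This is only an illustration, assuming GCC or Clang on x86; the function names and the `forced` flag are hypothetical stand-ins for whatever configure actually records, not MPICH symbols.

```c
/* Compile-time view: 1 if the compiler is generating AVX512F code
 * (the compiler defines __AVX512F__ when the feature is enabled,
 * e.g. with -mavx512f). */
static int compiled_with_avx512f(void)
{
#if defined(__AVX512F__)
    return 1;
#else
    return 0;
#endif
}

/* Runtime view: 1 if the CPU reports AVX512F.
 * Uses GCC/Clang x86 builtins; reports 0 on other targets. */
static int cpu_has_avx512f(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx512f") ? 1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical decision helper: `forced` stands for an explicit
 * avx512f request at configure time (assumption for illustration). */
static int use_avx512f(int forced)
{
    if (!compiled_with_avx512f())
        return 0;               /* no AVX512F code was generated */
    if (forced)
        return 1;               /* explicit request: skip the runtime probe */
    return cpu_has_avx512f();   /* default: probe the CPU at runtime */
}
```

Calling `use_avx512f(1)` models the explicit configuration, where only the compile-time result matters; `use_avx512f(0)` models the default path, where the CPU is probed as well.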
I got some unexpected results. I was testing with test/mpi/bench/p2p_bw and it was improving the throughput with the send/recv loops set to 8. But when I tried OSU_BW, the results are...
I also changed the default of the CVARs to 1. This way, multi-polling becomes an opt-in feature.
The usefulness of the patch needs more investigation.
`ibv_reg_mr` failed. For some reason UCX thinks the address is host memory, not ROCm.
I did a quick test with `UCX_MEMTYPE_CACHE=no`; still the same issue. The problem only happens in the rndv path of the IB communication. Still digging.