Hui Zhou

Results 695 comments of Hui Zhou

Set `MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD=1000000` to make all messages above 1 MB use the new OFI RNDV path. There are 4 RNDV protocols: 1. pipeline - send packs to genq chunks and send...
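As a rough illustration of what such a threshold does, here is a minimal C sketch. It is not MPICH code: `read_eager_threshold` and `uses_rndv` are hypothetical helpers, the fallback value is up to the caller, and treating sizes strictly above the threshold as RNDV is an assumption about the boundary case.

```c
#include <stdlib.h>

/* Hypothetical helper (not an MPICH function): read the eager threshold
 * from the environment, returning `fallback` if the variable is unset,
 * empty, negative, or not a plain decimal integer. */
static long read_eager_threshold(long fallback)
{
    const char *s = getenv("MPIR_CVAR_CH4_OFI_EAGER_THRESHOLD");
    if (s == NULL || *s == '\0')
        return fallback;

    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v < 0)
        return fallback;
    return v;
}

/* Assumption: messages strictly larger than the threshold take the
 * rendezvous (RNDV) path; everything else is sent eagerly. */
static int uses_rndv(long msg_bytes, long threshold)
{
    return msg_bytes > threshold;
}
```

With the variable set to `1000000`, `uses_rndv(16777216, read_eager_threshold(0))` would report the RNDV path under these assumptions, while a 64 KB message would stay on the eager path.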

On Sunspot, the branch (`aurora_test`) has been loaded as the default, so he can test it there directly. @servesh is building the module for Aurora as well. I think once he...

@tommusta @garzaran, could you run a test with just 2 processes (i.e. `osu_bw`)? I am also not getting the same numbers as I got in https://github.com/pmodels/mpich/pull/7529#issuecomment-3161832162. I am getting...

For reference, here is what I get with `mpiexec -cpu-bind list:2 -n 2 ./tmusta/osu_mbw_mr -m 1:16777216 -d sycl D D`

| Size (bytes) | MB/s | Messages/s |
|--------------:|----------:|-------------:|
|...

I was using the latest `main` branch. After switching to commit 06f12a, I got my original bandwidth back (24 GB/s at very large message sizes). Here is the data with...

I am getting the data with the bench tests in MPICH's testsuite. The binaries are here: ``` -rwxr-xr-x 1 hzhou users 152544 Nov 5 20:00 /home/hzhou/pull_requests/mpich7536/test/mpi/bench/p2p_bw -rwxr-xr-x 1 hzhou users...

I acknowledge there are some latency issues at smaller message sizes, which I am trying to troubleshoot.

test:mpich/ch3/most test:mpich/ch4/most ✔️

I can't reproduce it. I tried both the latest `aurora_test` branch and an old `aurora` branch (last commit 12/21/2024). Neither hung, but I notice somehow we significantly improved...

I suspect this is the same network jam issue.