Wenduo Wang

178 comments by Wenduo Wang

I was also concerned about breaking API compatibility, but none of our CI (internal and GitHub Actions) actually caught this, which is interesting. I was reading the code and saw...

Thanks for the PR. I'm running AWS CI.

This PR failed AWS internal CI. Seeing a lot of failures ``` mpirun --wdir . -n 72 --hostfile hostfile --map-by ppr:36:node --timeout 1800 -x PATH mpi-benchmarks-IMB-v2021.7/IMB-MPI1 Scatterv -npmin 72 -iter...

I ran our tests again and saw many failures, as shown above. I haven't had a chance to look into them yet. A quick glance shows that `--enable-debug` fixes those failures.

Finally I got some time to look into this. The issue happens on 2 nodes during MPI_Init ``` ... [ip-172-31-4-77.us-west-2.compute.internal:26954] mca: base: components_open: component tcp open function successful [ip-172-31-4-77.us-west-2.compute.internal:26954] select:...

Unfortunately the same tests are still failing...

@rhc54 In our CI we build applications separately for the debug and non-debug MPI builds, so this shouldn't be an issue. @hppritcha I wonder if someone on your side could quickly...