bhendersonPlano
[bad_run_output.txt](https://github.com/open-mpi/ompi/files/15029514/bad_run_output.txt)
Sorry for being confusing - will try a different approach next time. Looks like it is using the same pml on both processes. ``` $ srun --reservation=brent -N 2 -n...
Yep, we have `MpiDefault=pmix` in the slurm.conf file so I didn't need it on the srun command line. Good suggestion about the slurm 23.11.x build. I may drop back a...
Thanks Ralph - I backed down to slurm 23.02.7 (with the same hwloc and pmix) and saw the same issue. My next experiment is to back down PMIx from 5.0.2...
No problem - thanks for the suggestions and patch. I've gotten through the first two tests on my list and have some good news. Backing down to pmix 4.2.9 and rebuilding...
Looks like your suspicion about receiving a non-string object was correct: ``` [cn04] check:select: got a non-string (PMIX_BYTE_OBJECT) pml from process [[35399,0],0] ``` Full file here - [bad_run_output_patched.txt](https://github.com/open-mpi/ompi/files/15043395/bad_run_output_patched.txt) let me...
I applied the patch and rebuilt PMIx. Next, I rebuilt slurm and openmpi pointing to that new build. Good news is that everything worked. I kind of wish I had...