Hydra: Calls to MPI_Init{_thread} hang if any other process exits without calling init
Consider a program where one process calls init and another process exits without calling init:
$ cat mpitest.cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <mpi.h>
int getPmiRank() {
    const char* rankStr = std::getenv("PMI_RANK");
    return std::stoi(rankStr);
}

int main(int, char**) {
    const auto rank = getPmiRank();
    if (rank > 0) {
        std::cout << "About to throw 0" << std::endl;
        throw 0; // this can also be return 0
    }
    std::cout << "About to call MPI_Init" << std::endl;
    MPI_Init(nullptr, nullptr);
    std::cout << "MPI_Init finished" << std::endl;
    MPI_Finalize();
    return 0;
}
$ mpic++ mpitest.cpp -o mpitest
$ mpiexec -np 2 -l ./mpitest
[0] About to call MPI_Init
[1] About to throw 0
[1] terminate called after throwing an instance of 'int'
The program hangs, with rank 0 waiting forever for the other process.
The behaviour is much nicer with the old SMPD PM:
$ mpiexec -np 2 -l ./mpitest
[0]About to call MPI_Init
[1]About to throw 0
[1]terminate called after throwing an instance of 'int'
[0][0] PMI_Init failed: FAIL - init called when another process has exited without calling init
[0]Fatal error in MPI_Init: Other MPI error, error stack:
[0]MPIR_Init_thread(433):
[0]MPID_Init(140).......: channel initialization failed
[0]MPID_Init(402).......: PMI_Init returned -1
job aborted:
rank: node: exit code[: error message]
0: <hostname>: 1: process 0 exited without calling finalize
1: <hostname>: -2
SMPD notices that some processes have called init while others have exited, and cleans everything up. It would be great if Hydra were able to do this too. I'm not sure whether this was a deliberate change in behaviour when moving to the new process manager. For programs you write yourself you can simply make sure to call init before doing anything else, as the manpage suggests; when running an executable you don't have full control over, though, that's not always possible.
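For reference, a minimal sketch of that recommended ordering (calling MPI_Init before doing anything else and querying the rank through MPI_Comm_rank rather than reading PMI_RANK) would look something like this:
#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    // Initialize MPI before doing any other work, as the manpage suggests.
    MPI_Init(&argc, &argv);

    // Query the rank through MPI rather than the PMI_RANK environment variable.
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    std::cout << "Rank " << rank << " initialized" << std::endl;

    MPI_Finalize();
    return 0;
}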
Yeah, we can fix it. Thanks for reporting.
Brilliant, thanks!
Hmm, I couldn't reproduce the issue:
$ mpiexec -l -n 2 ./mpitest
[0] About to call MPI_Init
[1] About to throw 0
[1] terminate called after throwing an instance of 'int'
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 1323964 RUNNING AT tiger
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
The error message is different from the SMPD PM, which reports the error in PMI_Init, but the behavior is essentially similar. I don't see it hang.
Hydra does not block in PMI_Init, but it will kill the other process if one of the processes exits without going through PMI.
Sorry, I'd been using mpich installed via apt, which was v3.3a2 from last year. I just tried again using v4.0.2 and get a similar result to you (except with exit code 6).
I do still get the hang with v4.0.2 if instead of throwing something I return 1. I think it makes sense for Hydra to treat that the same way.
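For reference, the variant that still hangs for me is just the repro above with the rank > 0 branch changed to exit normally, something like:
    if (rank > 0) {
        std::cout << "About to return 1" << std::endl;
        return 1; // exits normally without ever calling MPI_Init
    }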
I confirm that if one process returns, we will hang. Let me see if I can fix that.
The code in question is here - https://github.com/pmodels/mpich/blob/e09f4cd9df7a1f238f9eee17a5f0fb80bc16ea52/src/pm/hydra/proxy/pmip_cb.c#L308
If we remove the HYD_pmcd_pmip.downstream.pmi_fd_active[pid] condition, then an exit without PMI_Init will trigger the same cleanup behavior. We need to discuss its implications within the team.
In general we do not want a non-MPI process's exit to terminate other processes. It is common to use Hydra to launch non-MPI processes and expect each process to proceed independently. However, we should be able to update "dead_processes" in this case and prevent the other MPI processes from hanging.
@jmcave I am about to mark this issue as "wontfix". Launching a set of processes that mixes MPI processes (i.e. ones that call MPI_Init) and non-MPI processes is incorrect usage. The MPI processes hanging in init while waiting for the rest of the processes to call MPI_Init is the correct behavior. This is different from some of the processes exiting abnormally before they are able to call MPI_Init; that case works as shown in https://github.com/pmodels/mpich/issues/6124#issuecomment-1218915214. If a process simply returns from main, then it is no longer valid MPI code, since it exits normally without calling MPI_Init.
I agree that mixing MPI and non-MPI processes as you described is incorrect usage. It would still be helpful, though, if mpiexec could identify that it's running invalid code and terminate the processes.