
Hydra: Calls to MPI_Init{_thread} hang if any other process exits without calling init

Open jmcave opened this issue 3 years ago • 9 comments

Consider a program where one process calls init and another process exits without calling init:

$ cat mpitest.cpp

#include <cstdlib>
#include <iostream>
#include <string>
#include <mpi.h>

int getPmiRank() {
    const char* rankStr = std::getenv("PMI_RANK");
    return std::stoi(rankStr);
}

int main(int, char**) {
    const auto rank = getPmiRank();
    if (rank > 0) {
        std::cout << "About to throw 0" << std::endl;
        throw 0; //this can also be return 0
    }

    std::cout << "About to call MPI_Init" << std::endl;
    MPI_Init(nullptr, nullptr);
    std::cout << "MPI_Init finished" << std::endl;
    MPI_Finalize();
    return 0;
}

$ mpic++ mpitest.cpp -o mpitest
$ mpiexec -np 2 -l ./mpitest

[0] About to call MPI_Init
[1] About to throw 0
[1] terminate called after throwing an instance of 'int'

The program hangs, with rank 0 waiting forever for the other process.

The behaviour is much nicer with the old SMPD PM:

$ mpiexec -np 2 -l ./mpitest

[0]About to call MPI_Init
[1]About to throw 0
[1]terminate called after throwing an instance of 'int'
[0][0] PMI_Init failed: FAIL - init called when another process has exited without calling init
[0]Fatal error in MPI_Init: Other MPI error, error stack:
[0]MPIR_Init_thread(433):
[0]MPID_Init(140).......: channel initialization failed
[0]MPID_Init(402).......: PMI_Init returned -1

job aborted:
rank: node: exit code[: error message]
0: <hostname>: 1: process 0 exited without calling finalize
1: <hostname>: -2

SMPD notices that some processes have called init while others have exited, and cleans everything up. It would be great if Hydra were able to do this too. I'm not sure if this was a deliberate change in behaviour when moving to the new process manager. For programs you write yourself you can just make sure to always call init before doing anything else, as the manpage suggests, but that's not always possible when running an executable you don't have full control over.
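For reference, here is a minimal sketch of the "call init before doing anything else" pattern, using MPI_Comm_rank instead of the PMI_RANK environment variable (illustrative only, not part of the original report):

#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    // Initialize MPI before doing anything else, as the manpage suggests.
    MPI_Init(&argc, &argv);

    // Query the rank through MPI rather than reading PMI_RANK.
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank > 0) {
        std::cout << "Rank " << rank << " exiting early" << std::endl;
        // Shut MPI down cleanly so no process leaves without going
        // through init/finalize.
        MPI_Finalize();
        return 0;
    }

    std::cout << "Rank 0 doing work" << std::endl;
    MPI_Finalize();
    return 0;
}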

jmcave avatar Aug 16 '22 16:08 jmcave

Yeah, we can fix it. Thanks for reporting.

hzhou avatar Aug 16 '22 18:08 hzhou

Brilliant, thanks!

jmcave avatar Aug 17 '22 16:08 jmcave

Hmm, I couldn't reproduce the issue:

$ mpiexec -l -n 2 ./mpitest                             
[0] About to call MPI_Init                                                          
[1] About to throw 0                                                                
[1] terminate called after throwing an instance of 'int'                            
                                                                                    
=================================================================================== 
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES                            
=   PID 1323964 RUNNING AT tiger                                                    
=   EXIT CODE: 9                                                                    
=   CLEANING UP REMAINING PROCESSES                                                 
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES                                       
=================================================================================== 
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)            
This typically refers to a problem with your application.                           
Please see the FAQ page for debugging suggestions                                   

The error message is different from the SMPD PM, which raises an error in PMI_Init, but the behavior is essentially similar. I don't see it hang.

Hydra does not block in PMI_Init, but it will kill the other process if one of the processes exits without going through PMI.

hzhou avatar Aug 18 '22 02:08 hzhou

Sorry, I'd been using mpich installed via apt, which was v3.3a2 from last year. I just tried again using v4.0.2 and get a similar result to you (except with exit code 6).

I do still get the hang with v4.0.2 if instead of throwing something I return 1. I think it makes sense for Hydra to treat that the same way.
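Concretely, the variant that still hangs only swaps the throw for a return in the reproducer above (reconstructed from the description, not copied from a posted file):

    if (rank > 0) {
        std::cout << "About to return 1" << std::endl;
        return 1; // exits normally without ever calling MPI_Init; rank 0 then hangs in MPI_Init
    }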

jmcave avatar Aug 18 '22 16:08 jmcave

I confirm that if one process returns we will hang. Let me see if I can fix that.

hzhou avatar Aug 18 '22 16:08 hzhou

The code in question is here - https://github.com/pmodels/mpich/blob/e09f4cd9df7a1f238f9eee17a5f0fb80bc16ea52/src/pm/hydra/proxy/pmip_cb.c#L308

If we remove the HYD_pmcd_pmip.downstream.pmi_fd_active[pid] condition, then an exit without PMI_Init will trigger the same cleanup behavior. We need to discuss its implications within the team.
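For orientation, a rough paraphrase of the condition being discussed; only the pmi_fd_active field comes from the comment above, while the surrounding structure and the cleanup call are assumed rather than taken from pmip_cb.c:

/* Paraphrased sketch, not the actual mpich source: the proxy only triggers
 * cleanup for an exiting child if that child had an active PMI connection. */
if (HYD_pmcd_pmip.downstream.pmi_fd_active[pid]) {
    /* Notify mpiexec so the remaining processes get cleaned up.
     * notify_and_cleanup() is a hypothetical placeholder. */
    notify_and_cleanup();
}
/* Dropping the pmi_fd_active check would make an exit without PMI_Init
 * take the same cleanup path. */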

hzhou avatar Aug 18 '22 17:08 hzhou

In general we do not want a non-MPI process's exit to terminate other processes. It is common to use Hydra to launch non-MPI processes and expect each process to proceed independently. However, we should be able to update "dead_processes" in this case and prevent the other MPI processes from hanging.

hzhou avatar Aug 18 '22 17:08 hzhou

@jmcave I am about to mark this issue as "wontfix". Launching a set of processes that mixes MPI processes (i.e. ones that call MPI_Init) with non-MPI processes is incorrect usage. The MPI processes hanging in init while waiting for the rest of the processes to call MPI_Init is the correct behavior. This is different from some of the processes exiting abnormally before they are able to call MPI_Init; that case works as shown in https://github.com/pmodels/mpich/issues/6124#issuecomment-1218915214. If a process simply returns from main, then it is no longer valid MPI code, since it exits normally without calling MPI_Init.

hzhou avatar Aug 25 '22 18:08 hzhou

I agree that mixing MPI and non-MPI processes as you described is incorrect usage. It would be helpful, though, if mpiexec could identify that it's running invalid code and terminate the processes.

jmcave avatar Sep 01 '22 16:09 jmcave