Minimize code changes when converting MPI app to native Hermes app
Current changes required to convert an MPI app to a native Hermes app
1. Request `MPI_THREAD_MULTIPLE` support. This is only required if the app isn't already doing it, and it's possible that `MPI_THREAD_FUNNELED` would be sufficient (this will be determined once we're ready to release version 1.0).
2. Initialize Hermes: `std::shared_ptr<hapi::Hermes> hermes = hapi::InitHermes(conf);`
3. Acquire a special Hermes `MPI_Comm` and use it anywhere the app would have used `MPI_COMM_WORLD`. Example:

   ```cpp
   MPI_Comm *app_comm = (MPI_Comm *)hermes->GetAppCommunicator();
   MPI_Bcast(..., *app_comm);
   ```

4. Put the application code into an `if (hermes->IsApplicationCore()) {...}` block.
5. Finalize Hermes (outside the `if` block of 4): `hermes->Finalize()`
6. Run `mpirun` with one extra process per node compared to a normal run (a combined sketch of steps 1 through 5 follows the launch examples below):
```bash
# Normal single node run
mpirun -n 4 a.out

# Hermes single node run
mpirun -n 5 a.out

# Normal multi-node run
mpirun -n 4 -ppn 2 -hosts node1,node2 a.out

# Hermes multi-node run
mpirun -n 6 -ppn 3 -hosts node1,node2 a.out
```
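Putting steps 1 through 5 together, a converted app might look like the following sketch. It assumes the `hapi::InitHermes`, `GetAppCommunicator`, `IsApplicationCore`, and `Finalize` calls described above; the Hermes API header, the `conf` object, and the application body are placeholders rather than part of a confirmed API.

```cpp
#include <memory>
#include <mpi.h>

namespace hapi = hermes::api;

int main(int argc, char **argv) {
  // Step 1: request MPI_THREAD_MULTIPLE (MPI_THREAD_FUNNELED may turn out to suffice).
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

  // Step 2: initialize Hermes. conf stands in for the Hermes configuration.
  std::shared_ptr<hapi::Hermes> hermes = hapi::InitHermes(conf);

  // Step 4: only the application cores run the app code; the extra rank per
  // node becomes the Hermes core.
  if (hermes->IsApplicationCore()) {
    // Step 3: use the Hermes app communicator wherever MPI_COMM_WORLD was used.
    MPI_Comm *app_comm = (MPI_Comm *)hermes->GetAppCommunicator();
    int rank;
    MPI_Comm_rank(*app_comm, &rank);
    // ... original application code goes here ...
  }

  // Step 5: finalize Hermes outside the IsApplicationCore() block, then MPI.
  hermes->Finalize();
  MPI_Finalize();
  return 0;
}
```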
Steps to eliminate the code change requirements
- Eliminate 1 and 2 by intercepting `MPI_Init`/`MPI_Init_thread` and requesting the appropriate thread support, then calling `InitHermes` and storing the result in a global (singleton).
- Eliminate 3 by intercepting every MPI function that takes an `MPI_Comm` argument and applying the following algorithm (a PMPI-based sketch of these wrappers appears after the launch examples below):

  ```text
  if communicator == MPI_COMM_WORLD:
      replace communicator with result of hermes->GetAppCommunicator()
  ```

- Eliminate 5 by intercepting `MPI_Finalize`.
- Eliminate 4 by running the app in MPMD style, where we run an instance of a special `hermes_core` program on each node that looks like this:
```cpp
#include <mpi.h>
// (plus the Hermes API header and <memory>)

namespace hapi = hermes::api;

int main(int argc, char **argv) {
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  std::shared_ptr<hapi::Hermes> hermes = hapi::InitHermes(conf);  // conf: Hermes configuration
  hermes->Finalize();
  MPI_Finalize();
  return 0;
}
```
The `mpirun` commands would now look like this:
```bash
# Normal single node run
mpirun -n 4 a.out

# Hermes single node run
mpirun -n 1 hermes_core : -n 4 a.out

# Normal multi-node run
mpirun -n 4 -ppn 2 -hosts node1,node2 a.out

# Hermes multi-node run
mpirun -n 2 -ppn 1 -hosts node1,node2 hermes_core : -n 4 -ppn 2 a.out
```
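As a rough illustration of the interception items above, the wrappers might use the MPI profiling interface (PMPI), where each `MPI_*` symbol is overridden and forwards to its `PMPI_*` counterpart. This is only a sketch: the global `hermes` handle, the `ReplaceCommIfWorld` helper, and the way `conf` is obtained are assumptions, not the actual Hermes implementation, and `MPI_Bcast` stands in for the many communicator-taking functions that would need the same treatment.

```cpp
#include <memory>
#include <mpi.h>

namespace hapi = hermes::api;

// Hypothetical global handle filled in by the MPI_Init_thread wrapper.
static std::shared_ptr<hapi::Hermes> hermes;

// Hypothetical helper: swap MPI_COMM_WORLD for the Hermes app communicator.
static MPI_Comm ReplaceCommIfWorld(MPI_Comm comm) {
  if (comm == MPI_COMM_WORLD) {
    return *(MPI_Comm *)hermes->GetAppCommunicator();
  }
  return comm;
}

// Eliminates 1 and 2: force the required thread level and start Hermes.
extern "C" int MPI_Init_thread(int *argc, char ***argv, int required,
                               int *provided) {
  (void)required;  // always request MPI_THREAD_MULTIPLE regardless of the app's request
  int rc = PMPI_Init_thread(argc, argv, MPI_THREAD_MULTIPLE, provided);
  hermes = hapi::InitHermes(conf);  // conf: Hermes configuration (placeholder)
  return rc;
}

// Eliminates 3 for one of the many MPI functions that take an MPI_Comm.
extern "C" int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
                         int root, MPI_Comm comm) {
  return PMPI_Bcast(buffer, count, datatype, root, ReplaceCommIfWorld(comm));
}

// Eliminates 5: finalize Hermes before MPI shuts down.
extern "C" int MPI_Finalize() {
  hermes->Finalize();
  return PMPI_Finalize();
}
```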
We could take this one step further and eliminate the need to intercept MPI calls that take a communicator (i.e., let the app use `MPI_COMM_WORLD` as normal) by running the `hermes_core` program as a daemon. Instead of `InitHermes()` it would call `InitHermesDaemon()` (sketched after the launch commands below). In that case the launch commands would be:
```bash
# Single node
mpirun -n 1 hermes_core &
mpirun -n 4 a.out

# Multi-node
mpirun -n 2 -ppn 1 -hosts node1,node2 hermes_core &
mpirun -n 4 -ppn 2 -hosts node1,node2 a.out
```
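For reference, a daemon-style `hermes_core` might differ from the MPMD version only in its init call. This sketch assumes that `InitHermesDaemon()` starts the Hermes core and services app-rank RPCs, and that `Finalize()` returns once a remote shutdown request arrives; neither behavior is specified above.

```cpp
#include <memory>
#include <mpi.h>

namespace hapi = hermes::api;

int main(int argc, char **argv) {
  int provided;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  // Assumption: InitHermesDaemon() starts the Hermes core and serves RPCs.
  std::shared_ptr<hapi::Hermes> hermes = hapi::InitHermesDaemon(conf);
  // Assumption: Finalize() blocks until the app ranks request shutdown.
  hermes->Finalize();
  MPI_Finalize();
  return 0;
}
```

Because the daemon is launched by a separate `mpirun`, the app's `MPI_COMM_WORLD` no longer contains the Hermes core ranks, which appears to be what removes the need for the communicator interception.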
Doing this would require 2 additional changes to the library:
- `Hermes::Finalize()` would have to call `RemoteFinalize()` on the app ranks to shut down the daemon (or force the user to shut it down explicitly; maybe `hermes up` and `hermes down` CLI programs).
- The app ranks would have to have a way to make sure the Hermes core is initialized before they begin executing. They would presumably loop on an `IsHermesInitialized` RPC (see the sketch below).
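A sketch of that startup check on the app-rank side, assuming a hypothetical `IsHermesInitialized()` call that performs the RPC (the name, return type, and polling interval are all placeholders):

```cpp
#include <chrono>
#include <thread>

// Placeholder declaration for the RPC wrapper described above; not a real API.
bool IsHermesInitialized();

// Block an app rank until the Hermes core daemon reports it is ready.
void WaitForHermesCore() {
  while (!IsHermesInitialized()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
}
```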