MPIClusterManagers.jl
Cleaner cluster termination when using MPIManager
A few related issues:
- warnings printed when using MPI transport
- finalization procedures differ between MPI_ON_WORKERS, MPI_TRANSPORT_ALL, and TCP_TRANSPORT_ALL. Implement a standard "close" function.
- warning printed by the Julia event loop when using MPI_ON_WORKERS
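To make the "close" request concrete, here is a minimal sketch of what a single teardown entry point could look like. It rests on several assumptions: `close_cluster` is a hypothetical name that does not exist in the package; the mode is passed in explicitly rather than read from the manager; `MPIClusterManagers.stop_main_loop` is assumed to be the documented shutdown call for both `*_TRANSPORT_ALL` modes; and `rmprocs` from `Distributed` is assumed to be how workers are removed under MPI_ON_WORKERS. It has not been tested under `mpirun`.

```julia
using Distributed
using MPIClusterManagers  # assumed to export MPIManager and the three mode constants

# Hypothetical helper, NOT part of MPIClusterManagers.jl.
# `mode` is whichever transport mode the cluster was started with.
function close_cluster(manager, mode)
    if mode == MPI_ON_WORKERS
        # Workers were added with addprocs(manager); remove them explicitly
        # instead of relying on process exit to clean them up.
        if nprocs() > 1
            rmprocs(workers())
        end
    else
        # MPI_TRANSPORT_ALL and TCP_TRANSPORT_ALL were started with
        # start_main_loop(mode), so they are stopped through the matching call.
        MPIClusterManagers.stop_main_loop(manager)
    end
    return nothing
end
```

A caller would then end every script with `close_cluster(manager, mode)` regardless of how the cluster was started, which is the uniformity this issue asks for.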
I noticed a similar issue on a Cray system (I assume the problems above are observable on any system, but I can provide more info and/or test any fixes if that would be helpful). Parallel jobs terminate with the following:
WARNING: Forcibly interrupting busy workers
WARNING: Unable to terminate all workers