
Parallel simulation using slurm queue

Open jslim-furame opened this issue 4 years ago • 3 comments

Let us assume that there are three nodes and each node has 8 GPUs. Is it possible to use all 24 GPUs simultaneously if there are more than 24 lambda points? In GROMACS this is possible, because we can assign one GPU to each lambda point by creating an mdp file for each lambda value. I looked through the YANK manual but could not find a clear answer. Thanks in advance.

jslim-furame avatar Dec 04 '20 08:12 jslim-furame

Hi @jslim-furame. Yes, YANK will take care of parallelization if you are using replica exchange (which is the default). However, if the number of lambda points is not a multiple of the number of GPUs (24 here), some of the GPUs will sit idle for part of each iteration while they wait for the simulations at the remaining lambda points to catch up.
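
To make the idle-time point concrete, here is a rough back-of-the-envelope sketch. It assumes that each GPU propagates one lambda point at a time and that every lambda point takes the same wall time per iteration; neither assumption comes from YANK itself, and the function name `gpu_schedule` is made up for illustration.

```python
import math


def gpu_schedule(n_lambda, n_gpus):
    """Estimate GPU usage for one iteration (illustrative model only).

    Assumes one lambda point per GPU at a time and equal cost per
    lambda point. Returns (rounds, idle_slots, utilization).
    """
    if n_lambda <= 0 or n_gpus <= 0:
        raise ValueError("n_lambda and n_gpus must be positive")
    # Number of sequential passes needed to cover all lambda points.
    rounds = math.ceil(n_lambda / n_gpus)
    # GPU slots that have nothing to do in the last pass.
    idle_slots = rounds * n_gpus - n_lambda
    # Fraction of the available GPU slots doing useful work.
    utilization = n_lambda / (rounds * n_gpus)
    return rounds, idle_slots, utilization


if __name__ == "__main__":
    for n_lambda in (24, 30, 48):
        rounds, idle, util = gpu_schedule(n_lambda, n_gpus=24)
        print(f"{n_lambda} lambda points: {rounds} round(s), "
              f"{idle} idle slot(s), utilization {util:.3f}")
```

Under this model, 24 or 48 lambda points keep all 24 GPUs busy, while 30 lambda points need two rounds and leave 18 of the 48 GPU slots unused in each iteration.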

andrrizzi avatar Dec 10 '20 11:12 andrrizzi

Thanks. As before, let us assume that there are 3 nodes with 8 GPUs each, so 24 GPUs are available in total. In addition, let us imagine there are 48 lambda points. Could you show me what the Slurm script would look like in this situation? My question is whether it is possible to use multiple nodes (not just a single node) through inter-node MPI communication.

jslim-furame avatar Dec 10 '20 12:12 jslim-furame

You can just launch YANK using the usual MPI executable and options supported by your cluster. YANK should be able to figure out whether it has been launched in MPI or serial mode by inspecting the MPI environment variables. The correct way of invoking an MPI job on SLURM depends on your system configuration, so your system admin is likely to be more helpful with this.
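
As an illustration of what "inspecting the MPI environment variables" can look like, here is a minimal sketch. It is not YANK's actual detection code: the function names and the list of variables are assumptions chosen for the example (`OMPI_COMM_WORLD_RANK` is set by Open MPI, `PMI_RANK` by MPICH-style launchers, and `SLURM_PROCID` by Slurm's `srun`).

```python
import os

# Assumed, illustrative list of rank variables set by common launchers.
# YANK may check a different set; this is not taken from its source.
MPI_RANK_VARIABLES = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "SLURM_PROCID")


def detect_mpi_rank(environ=None):
    """Return (variable_name, rank) for the first rank variable found.

    Returns None when no known variable is set, which this sketch
    treats as a serial launch.
    """
    env = os.environ if environ is None else environ
    for name in MPI_RANK_VARIABLES:
        value = env.get(name)
        if value is not None:
            return name, int(value)
    return None


def launch_mode(environ=None):
    """Classify the launch as 'mpi' or 'serial' from the environment."""
    return "serial" if detect_mpi_rank(environ) is None else "mpi"


if __name__ == "__main__":
    print("launch mode:", launch_mode())
    print("detected:", detect_mpi_rank())
```

Running this script once directly and once through your cluster's MPI launcher is a quick way to check which of these variables, if any, your launcher actually sets. Note that `SLURM_PROCID` is also set for a single-task `srun`, so a heuristic like this can report "mpi" for a one-process job.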

andrrizzi avatar Dec 14 '20 18:12 andrrizzi