
Multi-node distributed training

szhang42 opened this issue · 0 comments

Hello,

I am working on multi-node training with OpenNMT-py. I have two machines with 4 GPUs each. If I want to train OpenNMT-py across these two nodes, do I just set `world_size` to 8 and `gpu_ranks` to [0, 1, 2, 3, 4, 5, 6, 7]? Is there anything else I need to change to train on two nodes with 4 GPUs each? Thanks! @francoishernandez @guillaumekln
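For reference, my current understanding (based on OpenNMT-py's distributed options `-world_size`, `-gpu_ranks`, `-master_ip`, and `-master_port`; the IP address, port, and config filename below are placeholders) is that each node runs its own `onmt_train` with the *global* world size but only its *local* GPU ranks, roughly like this:

```shell
# Node 0 (reachable at 10.0.0.1, a placeholder address) owns ranks 0-3:
onmt_train -config config.yaml \
    -world_size 8 -gpu_ranks 0 1 2 3 \
    -master_ip 10.0.0.1 -master_port 10000

# Node 1 uses the same world_size and master address,
# but lists only its own four ranks:
onmt_train -config config.yaml \
    -world_size 8 -gpu_ranks 4 5 6 7 \
    -master_ip 10.0.0.1 -master_port 10000
```

Is this the right pattern, or should `gpu_ranks` be [0, 1, 2, 3, 4, 5, 6, 7] on both nodes?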

szhang42 · Aug 03 '21 01:08