nanoGPT

Cannot run train.py on multiple GPUs

Open • srivassid opened this issue 1 year ago • 1 comment

Hi

I can run train.py on a single GPU, but not on more than one. I have 2 GPUs, and if I run the command

torchrun --standalone --nproc_per_node=2 train.py config/train_shakespear_char.py

I get the error: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
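
One way to narrow this down is to check whether the worker processes actually receive the environment variables that torchrun exports. The sketch below uses only the standard library. It assumes that the training script enables distributed mode only when RANK is set, which is my reading of train.py and not something confirmed here. The file name check_env.py and the helper describe_launch are hypothetical names used for illustration.

```python
import os

# Environment variables that torchrun exports to each worker process.
TORCHRUN_VARS = ("RANK", "LOCAL_RANK", "WORLD_SIZE", "MASTER_ADDR", "MASTER_PORT")


def describe_launch(env=None):
    """Summarize whether this process looks like a torchrun worker."""
    env = os.environ if env is None else env
    found = {name: env.get(name) for name in TORCHRUN_VARS}
    missing = [name for name, value in found.items() if value is None]
    # Assumption: the training script turns on DDP only when RANK is set.
    ddp_expected = int(env.get("RANK", -1)) != -1
    return {"vars": found, "missing": missing, "ddp_expected": ddp_expected}


if __name__ == "__main__":
    info = describe_launch()
    print(f"DDP expected: {info['ddp_expected']}")
    for name, value in info["vars"].items():
        print(f"  {name} = {value}")
    if info["missing"]:
        print("Missing:", ", ".join(info["missing"]))
```

Launching it the same way as the failing command (torchrun --standalone --nproc_per_node=2 check_env.py) should print one report per process. If all variables are present there, the launcher is probably fine, and the error more likely comes from a torch.distributed call that runs before init_process_group in the script being launched.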

Can anyone help me out?

Thanks

srivassid, Jun 03 '23 08:06