nanoGPT
Multi GPUs training is very slow
I used 4 GPUs on 1 node:
torchrun --standalone --nproc_per_node=4 train.py --compile=False
But the training speed is the same as with 1 GPU. Why?
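One quick sanity check is whether the workers were actually launched under DDP at all. torchrun exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to every worker process, so a minimal sketch (not part of nanoGPT itself) for verifying the world size from inside the training script could look like this:

```python
import os

def ddp_world_size() -> int:
    """Number of processes torchrun launched; 1 if not running under torchrun.

    torchrun sets the WORLD_SIZE environment variable for every worker,
    while a plain `python train.py` run leaves it unset.
    """
    return int(os.environ.get("WORLD_SIZE", "1"))

if __name__ == "__main__":
    n = ddp_world_size()
    if n == 1:
        print("single-process run: DDP is not active")
    else:
        print(f"DDP run with {n} processes")
```

If this reports 1 even though you passed `--nproc_per_node=4`, the launch flags never took effect and the script fell back to single-GPU training. Note also that with DDP the *per-step* time on each GPU stays roughly the same; the speedup shows up as total throughput (tokens per second across all ranks), not as faster individual iterations.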
Hello, have you solved it? I have the same problem.
Check => https://pytorch.org/docs/stable/amp.html