nanoGPT
16 GPUs per node
Hi, my system has 16 GPUs per node. However, when I run

torchrun --standalone --nproc_per_node=16 train.py config/train_gpt2.py

the training crashes.
How can I use all 16 GPUs?
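
Before launching, a quick diagnostic I've been using to confirm that PyTorch actually sees all 16 devices (just a sketch to rule out driver or CUDA_VISIBLE_DEVICES issues, not part of nanoGPT itself):

    import torch

    # Check that CUDA is available and how many GPUs PyTorch can see.
    # If this prints fewer than 16, torchrun with --nproc_per_node=16
    # will fail before training even starts.
    print(f"CUDA available: {torch.cuda.is_available()}")
    print(f"Visible GPUs:   {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")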