Martin Vejvar

Results: 1 comment by Martin Vejvar

You might try migrating to torchrun? i.e.:

```
torchrun --nproc_per_node 2 examples/pytorch/language-modeling/run_clm.py \
    --model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
    --do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
```

for reference...
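As a side note on the torchrun suggestion above: torchrun passes each worker its rank information through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables rather than a `--local_rank` command-line argument, which is usually the main thing a custom script has to adapt to when migrating. A minimal sketch of reading those variables follows; the helper name `read_distributed_env` is hypothetical and is not part of `run_clm.py`, and the single-process defaults are an assumption for running outside torchrun.

```python
import os


def read_distributed_env(env=None):
    """Return (rank, local_rank, world_size) from torchrun-style variables.

    Hypothetical helper for illustration only. Falls back to single-process
    defaults (an assumption) when the variables are not set.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))              # global index of this process
    local_rank = int(env.get("LOCAL_RANK", 0))  # index of this process on its node
    world_size = int(env.get("WORLD_SIZE", 1))  # total number of processes
    return rank, local_rank, world_size


if __name__ == "__main__":
    rank, local_rank, world_size = read_distributed_env()
    print(f"rank={rank} local_rank={local_rank} world_size={world_size}")
```

Saved to a file and launched with `torchrun --nproc_per_node 2`, each of the two workers should report its own rank values; run with plain `python`, it falls back to the single-process defaults.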