CogVideo
How do I run multi-node training with sat? Which configuration settings need to be changed?
`#! /bin/bash
echo "RUN on $(hostname), CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
# Current single-node launch: --standalone with 8 processes on this machine
run_cmd="PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --standalone --nproc_per_node=8 train_video.py --base configs/test_cogvideox_5b.yaml configs/sft.yaml --seed $RANDOM"
# Print the command, then run it
echo ${run_cmd}
eval ${run_cmd}
echo "DONE on $(hostname)"`
How should the launch command be written?
Do the settings in sft.yaml and cogvideo5b.yaml need to be changed?
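For reference, here is a rough sketch of what a multi-node variant of the script above might look like, using only standard `torchrun` flags (`--nnodes`, `--node_rank`, `--master_addr`, `--master_port`). The variable names, the placeholder address `10.0.0.1`, the port, and the fixed seed are assumptions for illustration and do not come from the CogVideo repository. Whether sat or the YAML configs need further changes is not confirmed here.

```shell
#!/bin/bash
# Hypothetical per-node settings (assumptions; adjust for your cluster).
NNODES="${NNODES:-2}"                    # total number of machines
NODE_RANK="${NODE_RANK:-0}"              # 0 on the first machine, 1 on the second, ...
MASTER_ADDR="${MASTER_ADDR:-10.0.0.1}"   # placeholder address of the rank-0 machine
MASTER_PORT="${MASTER_PORT:-29500}"      # a free port reachable from every node
SEED="${SEED:-42}"                       # fixed, because $RANDOM would differ on each machine

# --standalone is meant for single-node runs, so it is replaced
# by explicit rendezvous flags that are identical on every node
# except for --node_rank.
run_cmd="PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True torchrun --nnodes=${NNODES} --node_rank=${NODE_RANK} --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} --nproc_per_node=8 train_video.py --base configs/test_cogvideox_5b.yaml configs/sft.yaml --seed ${SEED}"

# Print the command so it can be checked before launching.
echo "${run_cmd}"
# Uncomment once the values above are correct for your cluster:
# eval "${run_cmd}"
```

The same script would be run on each machine, changing only `NODE_RANK` (for example `NODE_RANK=1 bash run_multinode.sh` on the second machine, where `run_multinode.sh` is a hypothetical file name).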