improved-diffusion

How to train the model on multiple machines using mpiexec

Open wangsaisai716 opened this issue 2 years ago • 0 comments

Hi, thanks for the great repo. I have a question about how to train a model on multiple machines using mpiexec. I searched online and found this guide: https://gitee.com/mindspore/docs/blob/r1.2/tutorials/training/source_zh_cn/advanced_use/distributed_training_gpu.md. Following it, I tried a command like: mpirun -n 16 --hosts DEVICE1_IP:8,DEVICE2_IP:8 -x DATA_PATH=$DATA_PATH. However, it doesn't work.

wangsaisai716 · Jan 10 '23 06:01