pytorch-multi-gpu-training
One suggested improvement
When parallelizing the model, you can add one extra step:
# Convert BatchNorm to SyncBatchNorm.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)
This ensures that batch norm statistics are synced across all processes.
参考: https://theaisummer.com/distributed-training-pytorch/#step-1-initialize-the-distributed-learning-processes
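A minimal sketch of where this conversion fits (the toy `net` below is a placeholder, not from the original repo): `convert_sync_batchnorm` walks the module tree and replaces every `nn.BatchNorm*d` layer with `nn.SyncBatchNorm`, and should be called before wrapping the model in `DistributedDataParallel`.

```python
import torch.nn as nn

# Hypothetical toy model containing a BatchNorm layer.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Convert BatchNorm to SyncBatchNorm so batch statistics are
# aggregated across all processes during distributed training.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)

# In an actual DDP script (requires an initialized process group):
# net = nn.parallel.DistributedDataParallel(net.cuda(rank), device_ids=[rank])

print(type(net[1]).__name__)
```

Note that `SyncBatchNorm` only takes effect under `DistributedDataParallel` with NCCL; outside a process group it behaves like regular batch norm.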
OK, thanks, I'll look into it.