pytorch-multi-gpu-training
One suggested improvement
When parallelizing the model, you can add one extra step:
# Convert BatchNorm to SyncBatchNorm.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)
This ensures that batch norm statistics are synced across all processes.
参考: https://theaisummer.com/distributed-training-pytorch/#step-1-initialize-the-distributed-learning-processes
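A minimal sketch of where this conversion fits (the toy `net` below is a placeholder, not from the original repo): `convert_sync_batchnorm` walks the module tree and replaces every `nn.BatchNorm*d` layer with `nn.SyncBatchNorm`, and should be called before wrapping the model in `DistributedDataParallel`.

```python
import torch.nn as nn

# Hypothetical toy model containing a BatchNorm layer.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)

# Convert BatchNorm to SyncBatchNorm so batch statistics are
# aggregated across all processes during distributed training.
net = nn.SyncBatchNorm.convert_sync_batchnorm(net)

# In an actual DDP script (requires an initialized process group):
# net = nn.parallel.DistributedDataParallel(net.cuda(rank), device_ids=[rank])

print(type(net[1]).__name__)
```

Note that `SyncBatchNorm` only takes effect under `DistributedDataParallel` with NCCL; outside a process group it behaves like regular batch norm.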
OK, thanks, I'll look into it.