SimCLR
Issue regarding batch size
Hi, I have a question about batch size in Distributed Data Parallel. As I understand it, each GPU computes its loss separately on its own node, so how can a batch size of 64 per GPU across 64 GPUs be equivalent to a single batch size of 4096?
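For context, the equivalence only holds if negatives are shared across GPUs. Below is a minimal sketch (not this repository's actual code; the function name and shapes are illustrative assumptions) of how a SimCLR-style DDP setup can gather projection embeddings from every rank before computing NT-Xent, so each rank's loss sees negatives from the full global batch:

```python
# Sketch of cross-GPU negative sharing, assuming torch.distributed is initialized
# and each rank holds a local batch of 64 projected embeddings.
import torch
import torch.distributed as dist


def gather_embeddings(local_z: torch.Tensor) -> torch.Tensor:
    """Collect every rank's embeddings into one tensor of the global batch.

    all_gather returns detached copies, so this rank's own slice is put back
    to keep gradients flowing through its embeddings.
    """
    world_size = dist.get_world_size()
    gathered = [torch.zeros_like(local_z) for _ in range(world_size)]
    dist.all_gather(gathered, local_z)
    # Re-insert the original tensor so it keeps its grad_fn on this rank.
    gathered[dist.get_rank()] = local_z
    return torch.cat(gathered, dim=0)  # shape: (world_size * local_batch, dim)


# Usage inside the training step (names are hypothetical):
# z_local has shape (64, dim) on each of the 64 GPUs.
# z_global = gather_embeddings(z_local)   # shape (4096, dim)
# The similarity matrix for the contrastive loss is then built against
# z_global, so each positive pair is contrasted with negatives drawn from
# the full 4096-sample batch, matching a single-process batch of 4096.
```

If the implementation instead builds the similarity matrix only from the local 64 samples, the effective number of negatives per pair is much smaller, which is exactly the distinction the question is about.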