SimCLR
Issue regarding batch size
Hi, I have a question about batch size in Distributed Data Parallel. As I understand it, each GPU computes its loss separately on its own node, so how can a batch size of 64 per GPU across 64 GPUs be equivalent to a single batch size of 4096?
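For context, the equivalence only holds if negatives are shared across GPUs. Below is a minimal sketch (not this repository's actual code; the function name and shapes are illustrative assumptions) of how a SimCLR-style DDP setup can gather projection embeddings from every rank before computing NT-Xent, so each rank's loss sees negatives from the full global batch:

```python
# Sketch of cross-GPU negative sharing, assuming torch.distributed is initialized
# and each rank holds a local batch of 64 projected embeddings.
import torch
import torch.distributed as dist


def gather_embeddings(local_z: torch.Tensor) -> torch.Tensor:
    """Collect every rank's embeddings into one tensor of the global batch.

    all_gather returns detached copies, so this rank's own slice is put back
    to keep gradients flowing through its embeddings.
    """
    world_size = dist.get_world_size()
    gathered = [torch.zeros_like(local_z) for _ in range(world_size)]
    dist.all_gather(gathered, local_z)
    # Re-insert the original tensor so it keeps its grad_fn on this rank.
    gathered[dist.get_rank()] = local_z
    return torch.cat(gathered, dim=0)  # shape: (world_size * local_batch, dim)


# Usage inside the training step (names are hypothetical):
# z_local has shape (64, dim) on each of the 64 GPUs.
# z_global = gather_embeddings(z_local)   # shape (4096, dim)
# The similarity matrix for the contrastive loss is then built against
# z_global, so each positive pair is contrasted with negatives drawn from
# the full 4096-sample batch, matching a single-process batch of 4096.
```

If the implementation instead builds the similarity matrix only from the local 64 samples, the effective number of negatives per pair is much smaller, which is exactly the distinction the question is about.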