
Question about the batch size

Open · imhgchoi opened this issue 2 years ago · 0 comments

Hi, this work is awesome. I just have one small question. The paper says the total batch size is 128 for the CIFAR datasets and that 4 GPUs were used in parallel. That doesn't mean the total batch size is 128 * 4 = 512, does it? My understanding is that DDP is used for ImageNet and non-distributed training for CIFAR; is that correct?
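To make the question concrete, here is a minimal sketch of the distinction I mean, assuming a standard PyTorch `DistributedDataParallel` setup (the `make_loader` helper below is hypothetical, not from this repo). With `DistributedSampler`, the `batch_size` passed to the `DataLoader` is per process, so the effective batch size is `batch_size * world_size`; in a single-process run it is the total directly:

```python
# Hypothetical sketch (not the repo's actual training script) of how
# batch_size interacts with DDP vs. non-distributed training in PyTorch.
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler
from torchvision.datasets import CIFAR10
import torchvision.transforms as T

def make_loader(per_process_batch_size: int) -> DataLoader:
    dataset = CIFAR10(root="./data", train=True, download=True,
                      transform=T.ToTensor())
    if dist.is_initialized():
        # DDP case: each process loads a disjoint shard, so the
        # effective batch size is per_process_batch_size * world_size.
        sampler = DistributedSampler(dataset)
        loader = DataLoader(dataset, batch_size=per_process_batch_size,
                            sampler=sampler)
        effective = per_process_batch_size * dist.get_world_size()
    else:
        # Single-process case: batch_size is already the total.
        loader = DataLoader(dataset, batch_size=per_process_batch_size,
                            shuffle=True)
        effective = per_process_batch_size
    print(f"effective batch size: {effective}")
    return loader
```

So if CIFAR training is non-distributed, `batch_size=128` would already be the total of 128 rather than 512.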

Thanks a ton :)

imhgchoi · Apr 14, 2022