faster-pytorch-blog
python 7_fabric.py no speedup (4 RTX 3090)

Thanks~
Hm, what's your baseline speed on a single GPU? And how many workers are you using in the dataloader?
On a single GPU, it takes ~22 min.
By the way, I cloned your code with git and ran it without changing a single line.
Thanks
I see, that's odd. So you get 18 min on 4 GPUs versus 22 min on a single GPU? I don't have a good explanation for this. One possibility: one of the GPUs was busy running something else at the same time, which would slow everything down because the other GPUs have to wait for it at the sync step.
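To put the two timings from this thread in perspective, here is a minimal sketch of the speedup and scaling-efficiency arithmetic. The function names are hypothetical, and it assumes the ideal case is perfectly linear scaling, which real multi-GPU training rarely reaches.

```python
def speedup(single_gpu_min: float, multi_gpu_min: float) -> float:
    """How many times faster the multi-GPU run is than the single-GPU run."""
    return single_gpu_min / multi_gpu_min


def scaling_efficiency(single_gpu_min: float, multi_gpu_min: float, num_gpus: int) -> float:
    """Fraction of the ideal (linear) speedup actually achieved."""
    return speedup(single_gpu_min, multi_gpu_min) / num_gpus


# Timings reported in this thread: ~22 min on 1 GPU, 18 min on 4 GPUs.
single, multi, gpus = 22.0, 18.0, 4

print(f"speedup: {speedup(single, multi):.2f}x")                        # speedup: 1.22x
print(f"efficiency: {scaling_efficiency(single, multi, gpus):.0%}")     # efficiency: 31%
print(f"ideal time with linear scaling: {single / gpus:.1f} min")       # 5.5 min
```

A 1.22x speedup on 4 GPUs is far below what linear scaling would give, which is consistent with something stalling the run, such as a slow GPU or a data-loading bottleneck.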