
About nsys profiling analysis using Swin-MoE

Open · QDDse opened this issue 3 years ago • 1 comment

I modified the code slightly to generate fake inputs for training Swin-MoE, and exported an nsys profile. What confuses me is why there are 7 AllReduce calls in the backward pass per step. Can somebody tell me why? Thanks very much! [image]

QDDse · Oct 26 '22 09:10

I used 8 experts and 8 GPUs on one node to train this.

QDDse · Oct 26 '22 09:10
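
One thing that may be worth checking for the question above: if the training loop wraps the model in PyTorch's `DistributedDataParallel`, gradients are all-reduced in buckets (default `bucket_cap_mb=25`), so the number of AllReduce calls per backward pass depends on the total gradient size rather than on the number of experts or GPUs. Whether this accounts for the 7 calls in this profile is an assumption and is not confirmed anywhere in the thread. The sketch below is a simplified estimate only: `count_gradient_buckets` is a hypothetical helper, the parameter sizes are made up, and the greedy packing ignores details of the real bucketing logic (such as a smaller first bucket).

```python
def count_gradient_buckets(param_numels, bytes_per_elem=4, cap_mb=25):
    """Roughly estimate how many gradient buckets (one all-reduce each)
    a bucketed data-parallel reducer would create.

    Simplified assumption: parameters are packed greedily in reverse
    order, and a bucket is closed when adding the next gradient would
    exceed the cap. This is not the exact DDP algorithm.
    """
    cap_bytes = cap_mb * 1024 * 1024
    buckets = 0
    current = 0  # bytes accumulated in the bucket being filled
    for numel in reversed(param_numels):
        size = numel * bytes_per_elem
        if current > 0 and current + size > cap_bytes:
            buckets += 1  # close the full bucket
            current = 0
        current += size
    if current > 0:
        buckets += 1  # count the last, partially filled bucket
    return buckets


# Made-up example: 20 float32 tensors with 1,000,000 elements each (4 MB apiece).
example_numels = [1_000_000] * 20
print(count_gradient_buckets(example_numels))
```

For a real model, the list of sizes could be taken from `[p.numel() for p in model.parameters() if p.requires_grad]`, leaving out any parameters that are excluded from data-parallel synchronization.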