
How to convert checkpoint files that adapt to different distributed world sizes

Open • swjtulinxi opened this issue 1 year ago • 1 comment

Hi, I have tried your example to convert swin_moe_small_patch4_window12_192_16expert_32gpu_22k. The first problem is that the format the example expects does not match the files of swin_moe_small_patch4_window12_192_16expert_32gpu_22k, so I modified some of the code. However, the example can only convert a single rank.pth; it does not merge all the rank.pth files into one. Can you show a correct example? I am puzzled by this question. Thanks.

swjtulinxi, Aug 27 '24 08:08

Just follow the instructions from https://github.com/microsoft/tutel/issues/248
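For readers unsure what "all rank.pth to one" involves, here is a minimal conceptual sketch in plain Python. It is not Tutel's converter and does not reproduce the instructions in the linked issue. It assumes a hypothetical layout in which each rank's checkpoint is a dict, expert parameters hold one entry per local expert and can be recognised by `.experts.` in their name, and all other parameters are replicated on every rank. The function names `gather_shards` and `scatter_shards` are made up for illustration. It also assumes the expert count is divisible by the target world size, so it does not cover setups where a single expert is shared or split across several ranks.

```python
from typing import Dict, List

Shard = Dict[str, list]


def is_expert_param(name: str) -> bool:
    # Assumption: expert-sharded parameters are identifiable by name.
    return ".experts." in name


def gather_shards(shards: List[Shard]) -> Shard:
    """Merge per-rank shards (ordered by rank) into one checkpoint."""
    merged: Shard = {}
    for name in shards[0]:
        if is_expert_param(name):
            # Concatenate local experts from every rank, in rank order.
            merged[name] = [e for shard in shards for e in shard[name]]
        else:
            # Replicated parameter: rank 0's copy is taken as canonical.
            merged[name] = shards[0][name]
    return merged


def scatter_shards(merged: Shard, world_size: int) -> List[Shard]:
    """Split a merged checkpoint into one shard per rank."""
    shards: List[Shard] = []
    for rank in range(world_size):
        shard: Shard = {}
        for name, value in merged.items():
            if is_expert_param(name):
                if len(value) % world_size != 0:
                    raise ValueError(
                        f"{name}: {len(value)} experts cannot be split "
                        f"evenly across {world_size} ranks"
                    )
                per_rank = len(value) // world_size
                shard[name] = value[rank * per_rank:(rank + 1) * per_rank]
            else:
                shard[name] = value
        shards.append(shard)
    return shards


# Usage: 4 ranks with 2 local experts each, re-sharded for 2 ranks.
old_shards = [
    {"moe.experts.w": [2 * r, 2 * r + 1], "norm.w": [9]} for r in range(4)
]
merged = gather_shards(old_shards)
new_shards = scatter_shards(merged, world_size=2)
```

With real checkpoints the list entries would be tensors, and the merge and split would be a concatenation and a chunking along the expert dimension. Which keys are expert-sharded depends on the model, so the name test above would need adjusting.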

ghostplant, Oct 29 '24 22:10