tutel
How to convert checkpoint files to adapt to different distributed world sizes
Hi, I tried your example to convert the swin_moe_small_patch4_window12_192_16expert_32gpu_22k checkpoint. The first problem is that the format the example expects does not match the files of swin_moe_small_patch4_window12_192_16expert_32gpu_22k, so I modified some of the code. However, the example can only convert one rank.pth file; it does not merge all the rank.pth files into one. Could you show a correct example? I am puzzled by this. Thanks.
Just follow the instructions from https://github.com/microsoft/tutel/issues/248
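For readers wondering what "merging all rank files into one" means in general, here is a minimal sketch. It is not Tutel's conversion utility, and the linked issue remains the authoritative procedure. It assumes that each rank's state dict holds identical copies of the shared parameters plus its own slice of the expert parameters, and that expert parameters can be recognized by the substring "experts" in their key. The function name `merge_rank_states` and the key names are hypothetical, and plain Python lists stand in for tensors so the sketch runs without PyTorch.

```python
def merge_rank_states(rank_states, is_expert_key=lambda key: "experts" in key):
    """Merge per-rank state dicts into a single state dict.

    Assumptions (illustrative only, not Tutel's real layout):
    - rank_states is ordered by rank index;
    - keys matching is_expert_key hold that rank's local experts,
      stacked along the leading dimension;
    - all other keys are replicated, so rank 0's copy is used.
    """
    merged = {}
    for key in rank_states[0]:
        if is_expert_key(key):
            # Concatenate local expert slices in rank order.
            # With real tensors this step would be torch.cat(..., dim=0).
            merged[key] = [row for state in rank_states for row in state[key]]
        else:
            merged[key] = rank_states[0][key]
    return merged


# Two hypothetical ranks, each holding one local expert.
rank0 = {"norm.weight": [1.0, 1.0], "moe.experts.fc1_w": [[0.1, 0.2]]}
rank1 = {"norm.weight": [1.0, 1.0], "moe.experts.fc1_w": [[0.3, 0.4]]}
merged = merge_rank_states([rank0, rank1])
```

With real checkpoints, each rank file would first be read with `torch.load(path, map_location="cpu")`, and the merged result could then be split again along the same dimension to match a different world size.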