IJCAI2023-OptimalShardedDataParallel
PyTorch version
Hello! I have obtained a ViT model from timm, and I want to train it using your OSDP method. However, OSDP requires torch version 1.10.2, while timm needs a higher version. What should I do in this situation?
Actually, we have merged OSDP's functionality into Galvatron, from Hetu, which is also a high-performance training framework. You can check the newest release, Galvatron-2, for an optimized implementation: https://github.com/PKU-DAIR/Hetu-Galvatron
Thank you!