torchtitan
Is a PP+FSDP+TP config + toml available for pre-training the 405B model?
I would appreciate it if someone could share a toml file for PP+FSDP+TP on the 405B model.
Hi @githubsgi - we have this one here: https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/train_configs/llama3_405b.toml Some of this will depend on how many GPUs you have and what type, given that memory will be a constraint.
Thanks, I am familiar with that one, but it sets PP to 1. All my attempts at setting PP > 1 have failed. Does the automatic slicing of layers work with the 405B model?
It should work. E.g., see Table 5 in https://github.com/pytorch/torchtitan/blob/main/docs/performance.md Could you provide a detailed bug report so we can help?
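For anyone trying the same thing, here is a rough sketch of the parallelism overrides one might layer on top of the 405B toml. The section and key names below are assumptions based on recent torchtitan config layouts and may differ in your checkout; the degrees and the schedule name are illustrative, so check them against the toml files shipped in the repo before using:

```toml
# Hypothetical PP+FSDP+TP overrides for llama3_405b.toml.
# Key names are assumptions -- verify against your torchtitan version.
[parallelism]
data_parallel_shard_degree = -1      # FSDP shards over the remaining GPUs
tensor_parallel_degree = 8           # TP within a node (8 GPUs/node assumed)
pipeline_parallel_degree = 4         # PP > 1 across nodes
pipeline_parallel_schedule = "1F1B"  # schedule name is an assumption
```

With TP = 8 and PP = 4, each model replica spans 32 GPUs, and FSDP fills in whatever GPUs remain; the product of the three degrees (times any DP replication) must equal the world size.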