Less Wright


Hi @githubsgi - we have this one here: https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/llama/train_configs/llama3_405b.toml. Some of the settings will depend on how many GPUs you have and what type they are, since memory will be a constraint.
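
To give a rough feel for why GPU count and type matter, here is a back-of-the-envelope sketch. It is not taken from torchtitan: the helper name `per_gpu_state_gb` is hypothetical, and it assumes about 16 bytes of model state per parameter (weights, gradients, and Adam optimizer state under a typical mixed-precision setup), sharded evenly across all GPUs. Activations and any framework overhead are ignored, so real usage will be higher.

```python
def per_gpu_state_gb(num_params: float, num_gpus: int, bytes_per_param: int = 16) -> float:
    """Rough per-GPU memory in GB (1e9 bytes) for model state only.

    Assumes the state is sharded evenly across GPUs and that each parameter
    costs `bytes_per_param` bytes in total (an assumed figure, not measured).
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return num_params * bytes_per_param / num_gpus / 1e9


if __name__ == "__main__":
    llama3_405b_params = 405e9  # parameter count implied by the model name
    for gpus in (64, 128, 512):
        gb = per_gpu_state_gb(llama3_405b_params, gpus)
        print(f"{gpus:4d} GPUs -> ~{gb:.1f} GB of model state per GPU")
```

Comparing the result against the memory of your GPU type (for example 80 GB) gives a first hint of whether a given GPU count can hold the model state at all, before tuning the rest of the config.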