JetMoE
What is the minimum GPU configuration for training?
What is the minimum number of V100 or A800 GPUs needed to train this model, if training time is not a concern?