Megatron-LM
Distributed Mamba Training
How to customise train.sh for distributed Mamba training?
Hello, as I've seen in the Megatron modules, there isn't a pre-defined bash script to pre-train a Mamba model on multiple GPUs. How can I set it up for model / data parallelism?
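For reference, here is a minimal sketch of what a multi-GPU `train.sh` could look like, assuming `pretrain_mamba.py` at the repo root, `torchrun` as the launcher, and the standard Megatron-LM parallelism flags (`--tensor-model-parallel-size`, `--pipeline-model-parallel-size`). The model sizes, hyperparameters, and paths below are placeholders, not values from this repo; verify every flag against your checkout, since Mamba-specific options vary by version:

```bash
#!/bin/bash
# Sketch of a multi-GPU launch for Mamba pre-training with Megatron-LM.
# Flag names follow the common GPT pre-training scripts; check them
# against the argument parser in your version of the repo.

GPUS_PER_NODE=8
NNODES=1
NODE_RANK=0
MASTER_ADDR=localhost
MASTER_PORT=6000

TP_SIZE=2   # tensor model parallelism
PP_SIZE=2   # pipeline model parallelism
# Data parallelism is implicit: DP = world_size / (TP * PP) = 8 / (2*2) = 2 here.

# Placeholder paths -- replace with your own.
CHECKPOINT_PATH=/path/to/checkpoints
DATA_PATH=/path/to/dataset_prefix
VOCAB_FILE=/path/to/gpt2-vocab.json
MERGE_FILE=/path/to/gpt2-merges.txt

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE \
                  --nnodes $NNODES \
                  --node_rank $NODE_RANK \
                  --master_addr $MASTER_ADDR \
                  --master_port $MASTER_PORT"

torchrun $DISTRIBUTED_ARGS pretrain_mamba.py \
    --tensor-model-parallel-size $TP_SIZE \
    --pipeline-model-parallel-size $PP_SIZE \
    --num-layers 48 \
    --hidden-size 4096 \
    --num-attention-heads 32 \
    --seq-length 4096 \
    --max-position-embeddings 4096 \
    --micro-batch-size 1 \
    --global-batch-size 32 \
    --train-iters 100000 \
    --lr 1.0e-4 \
    --lr-decay-style cosine \
    --min-lr 1.0e-5 \
    --weight-decay 0.1 \
    --clip-grad 1.0 \
    --bf16 \
    --data-path $DATA_PATH \
    --tokenizer-type GPT2BPETokenizer \
    --vocab-file $VOCAB_FILE \
    --merge-file $MERGE_FILE \
    --save $CHECKPOINT_PATH \
    --load $CHECKPOINT_PATH \
    --save-interval 1000 \
    --log-interval 10 \
    --eval-interval 1000 \
    --eval-iters 10
```

Note that you don't pass a data-parallel size explicitly: Megatron-LM derives it as `world_size / (TP * PP)`, so the launch above on 8 GPUs with TP=2 and PP=2 runs with a data-parallel size of 2. For multi-node training, set `NNODES`, `NODE_RANK`, and `MASTER_ADDR` per node and run the script on each one.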