Megatron-DeepSpeed
[MLM] Train script for non-causal decoder