Megatron-DeepSpeed
[wip] debug with new data
Working on debugging against a live checkpoint (with optimizer states), but using a small custom dataset.