Megatron-DeepSpeed
DeepSpeedCheckpoint needs to support bf16 optimizer states.
The prefix check at https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/a72225908e9bbda4d989bcdecd71c3c4a05a7f71/tools/convert_checkpoint/deepspeed_checkpoint.py#L5 seems wrong, since the optimizer-state files generated with bf16 use `bf16_zero_pp_rank` as their prefix, so they are not picked up when scanning for ZeRO partition files.
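One way to handle this would be to accept both prefixes when classifying checkpoint files. The sketch below is only illustrative, not the repository's actual code: the constant names `ZERO_FILE_PREFIX` and `BF16_ZERO_FILE_PREFIX`, the helper `is_zero_file`, and the sample filenames are assumptions for the example.

```python
# Hypothetical prefix constants; the plain one mirrors what
# deepspeed_checkpoint.py appears to use, the bf16 one covers the
# files reported in this issue.
ZERO_FILE_PREFIX = 'zero_pp_rank_'
BF16_ZERO_FILE_PREFIX = 'bf16_zero_pp_rank_'


def is_zero_file(filename: str) -> bool:
    """Return True if the filename looks like a ZeRO partition file,
    whether it was written by the fp16/fp32 or the bf16 optimizer."""
    # str.startswith accepts a tuple, so both prefixes are checked.
    return filename.startswith((ZERO_FILE_PREFIX, BF16_ZERO_FILE_PREFIX))
```

With a predicate like this, `DeepSpeedCheckpoint` could collect bf16 optimizer shards alongside the existing ones instead of silently skipping them.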