DeepSpeedExamples
How to use DeepSpeed checkpoints for BingBertSquad fine-tuning?
Hi there, the tutorial https://www.deepspeed.ai/tutorials/bert-finetuning/#loading-huggingface-and-tensorflow-pretrained-models makes clear how to load HF and TF checkpoints into DeepSpeed. What if we want to load a DeepSpeed checkpoint instead, like one produced by the Bing BERT example? Should we load the "mp_rank_00_model_states.pt" file from the checkpoint directory?
I'm currently training with fp16 and ZeRO-2, so I wonder whether loading that file directly would lose some precision. Should I use zero_to_fp32 to convert the checkpoint to fp32 before loading?
Is there an answer to this question? Thank you.
@xycforgithub and @1024er, apologies for the late response.
Yes, please use zero_to_fp32 to convert the DeepSpeed checkpoint into a standard PyTorch fp32 weights checkpoint.
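For anyone landing here later, a minimal sketch of what that conversion looks like, assuming a DeepSpeed release where `deepspeed.utils.zero_to_fp32` exposes `get_fp32_state_dict_from_zero_checkpoint` (the checkpoint and output paths below are hypothetical, not from this thread):

```python
# Sketch: consolidate a ZeRO-2 partitioned checkpoint into a plain fp32
# PyTorch state dict that a fine-tuning script can load directly.
# Paths are placeholders; adjust to your own checkpoint directory.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

checkpoint_dir = "path/to/bing_bert_checkpoint"  # hypothetical path

# Reconstruct the full fp32 weights from the partitioned optimizer/model states.
state_dict = get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir)

# Save as a standard PyTorch checkpoint for BingBertSquad fine-tuning.
torch.save(state_dict, "pytorch_model_fp32.bin")
```

Alternatively, DeepSpeed also drops a standalone `zero_to_fp32.py` script into the checkpoint directory when it saves, so running that script from inside the directory achieves the same conversion without writing any code.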