Megatron-DeepSpeed
How to convert a Hugging Face model to Megatron-DeepSpeed?
As the title says.
This is not possible. To download DS checkpoints refer to this issue: https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/319
Why? So I have to train from scratch? That is hard.
I don't understand the issue. Do you just need to run inference? If that is the case, DS-inference is compatible with all Huggingface models.
Hello, I have the same problem.
I want to load a model from Hugging Face as pretrained weights and continue training it with the Megatron-DeepSpeed framework.
However, I don't know how to convert the Hugging Face weights into Megatron-DeepSpeed weights.
I look forward to your help. Thank you.
By the way:
Model: a GPT model, link: https://huggingface.co/TsinghuaAI/CPM-Generate
I want to train the model with 4-way pipeline parallelism and DeepSpeed.
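To illustrate one reason the conversion is not a simple file copy: with 4 pipeline stages, the parameters of a single Hugging Face state dict would have to be split so that each stage receives only its own layers. The sketch below shows only that bookkeeping, under loud assumptions: the parameter names are hypothetical GPT-2-style names, the layer count is made up, and the split is a plain uniform one, which may not match how Megatron-DeepSpeed actually partitions layers. A real conversion would likely also need to rename and reshape tensors and write them in the framework's checkpoint layout, none of which is shown here.

```python
import re
from collections import defaultdict

# Hypothetical GPT-2-style parameter names. A real checkpoint's names would
# come from model.state_dict() and may differ from these.
NUM_LAYERS = 8  # assumed layer count, for illustration only
hf_keys = ["transformer.wte.weight", "transformer.wpe.weight"]
for i in range(NUM_LAYERS):
    hf_keys += [
        f"transformer.h.{i}.attn.c_attn.weight",
        f"transformer.h.{i}.mlp.c_fc.weight",
    ]
hf_keys += ["transformer.ln_f.weight"]

# Matches per-layer parameters and captures the layer index.
LAYER_RE = re.compile(r"^transformer\.h\.(\d+)\.")


def assign_stages(keys, num_layers, num_stages):
    """Map each parameter name to a pipeline stage using a uniform layer split."""
    per_stage = -(-num_layers // num_stages)  # ceiling division
    stages = defaultdict(list)
    for key in keys:
        match = LAYER_RE.match(key)
        if match:
            stage = int(match.group(1)) // per_stage
        elif key.startswith("transformer.ln_f"):
            stage = num_stages - 1  # assume the final layer norm sits on the last stage
        else:
            stage = 0  # assume embeddings sit on the first stage
        stages[stage].append(key)
    return dict(stages)


stages = assign_stages(hf_keys, NUM_LAYERS, num_stages=4)
for stage, keys in sorted(stages.items()):
    print(f"stage {stage}: {len(keys)} tensors")
```

Each stage's list could then be used to pick tensors out of the loaded state dict before any renaming or reshaping step.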
@AnShengqiang It's non-trivial to convert models for training. People are actively exploring this, as far as I know. This repository saves something called a universal checkpoint, which can be converted to other checkpoint formats. However, I am quite new here, so I don't really know how that works.
Thank you for your reply. I will look for an answer, and if I find anything useful, I will post it here.