When can we have the MoE checkpoint conversion script?

Open shamanez opened this issue 1 year ago • 6 comments

As mentioned here, a proper MoE/Mixtral checkpoint converter script would help us fine-tune Mixtral.

shamanez avatar Apr 22 '24 11:04 shamanez

+1

yqli2420 avatar Apr 23 '24 10:04 yqli2420

I also really need this tool.

https://github.com/NVIDIA/Megatron-LM/issues/756#issuecomment-2126186633

hwdef avatar May 23 '24 04:05 hwdef

Here is some information about converting Hugging Face checkpoints to NeMo. There appears to be a conversion script available on GitHub. I haven't verified it, but it might be useful: https://medium.com/karakuri/train-moes-on-aws-trainium-a0ebb599fbda and https://github.com/abeja-inc/Megatron-LM

oecompmind avatar Jun 14 '24 04:06 oecompmind

Sorry for the late response, please check: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/mixtral

yanring avatar Jul 16 '24 13:07 yanring

> Sorry for the late response, please check: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/mixtral

Does it support megatron_to_transformers? I am currently using https://github.com/alibaba/Pai-Megatron-Patch/blob/main/toolkits/model_checkpoints_convertor/mistral/hf2mcore.py

yqli2420 avatar Jul 18 '24 02:07 yqli2420

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Sep 16 '24 18:09 github-actions[bot]