When can we have the MoE checkpoint conversion script?
As mentioned here, having a proper MoE/Mixtral checkpoint converter script would help us fine-tune Mixtral.
+1
I also really need this tool.
https://github.com/NVIDIA/Megatron-LM/issues/756#issuecomment-2126186633
Here is some information about converting Hugging Face checkpoints to NeMo. There seems to be a conversion script available on GitHub; I haven't confirmed it myself, but it might be useful: https://medium.com/karakuri/train-moes-on-aws-trainium-a0ebb599fbda and https://github.com/abeja-inc/Megatron-LM
Sorry for the late response, please check: https://github.com/NVIDIA/Megatron-LM/tree/main/examples/mixtral
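For anyone landing here later: the linked examples directory documents an HF-to-MCore conversion driven by `tools/checkpoint/convert.py`. A minimal sketch of the invocation, assuming the loader/saver module names (`mixtral_hf`, `mcore`), the parallelism flags, and the paths shown below; verify the exact arguments against the README in that directory and your Megatron-LM checkout:

```bash
# Sketch: convert a Hugging Face Mixtral checkpoint to Megatron-Core format.
# Loader/saver names, flags, and paths below are assumptions taken from the
# examples/mixtral README -- check them against your Megatron-LM version.
HF_FORMAT_DIR=/workspace/checkpoints/mixtral-hf        # hypothetical path
TOKENIZER_MODEL=${HF_FORMAT_DIR}/tokenizer.model
TARGET_TP=1   # target tensor-parallel size
TARGET_PP=4   # target pipeline-parallel size
TARGET_EP=8   # target expert-parallel size
MCORE_DIR=/workspace/checkpoints/mixtral-mcore-TP${TARGET_TP}PP${TARGET_PP}EP${TARGET_EP}

python tools/checkpoint/convert.py \
    --model-type GPT \
    --loader mixtral_hf \
    --saver mcore \
    --target-tensor-parallel-size ${TARGET_TP} \
    --target-pipeline-parallel-size ${TARGET_PP} \
    --target-expert-parallel-size ${TARGET_EP} \
    --load-dir ${HF_FORMAT_DIR} \
    --save-dir ${MCORE_DIR} \
    --tokenizer-model ${TOKENIZER_MODEL}
```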
Does it support conversion in the other direction, megatron_to_transformers? I am currently using https://github.com/alibaba/Pai-Megatron-Patch/blob/main/toolkits/model_checkpoints_convertor/mistral/hf2mcore.py
Marking as stale. No activity in 60 days.