Converting a Gemma MaxText-compatible checkpoint to Hugging Face format
I have looked around for a script that could convert MaxText Gemma and Gemma 2 checkpoints to Hugging Face format, but I have not found anything. This may be related to https://github.com/google/maxtext/pull/581
Any update on this?
Indeed, https://github.com/google/maxtext/pull/581 was adding support for this. Out of curiosity, what is your use case for this?
Hi @gobbleturk , https://github.com/google/maxtext/pull/581 does not work with Gemma 2 because Gemma 2 uses both local and global attention. I think the attention layers alternate, with each local layer followed by a global one, so the q, k, and v weights have to be mapped accordingly. My use case is that I did continual pre-training of the Gemma 2 2B model on a monolingual pre-training dataset, and I want to use the HF SFT trainer for supervised fine-tuning.
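To illustrate why the older conversion script breaks: a Gemma 2 converter has to know, per layer, whether it is a local (sliding-window) or global attention layer, since Hugging Face configures them differently. A minimal sketch, assuming the common convention that even-indexed layers use sliding-window attention and odd-indexed layers use global attention (the exact parity should be verified against the checkpoint being converted):

```python
# Hypothetical sketch of the per-layer mapping a Gemma 2 converter needs.
# Assumption: even layer indices -> local (sliding-window) attention,
# odd layer indices -> global attention. Verify this against the actual
# model config before relying on it.

def attention_type(layer_idx: int) -> str:
    """Return which attention variant a given Gemma 2 layer uses."""
    return "local" if layer_idx % 2 == 0 else "global"

def layer_plan(num_layers: int) -> list[str]:
    """Build the full interleaved local/global layout for the model."""
    return [attention_type(i) for i in range(num_layers)]
```

For example, `layer_plan(4)` gives `["local", "global", "local", "global"]`, which is the interleaving a naive Gemma 1 script does not account for.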
Thank you for taking care of this, though.
@gobbleturk any update on this? Adding this feature would support further research on the Gemma model, especially academic work on low-resource languages.
You should find a conversion script for Gemma 2 here: https://github.com/AI-Hypercomputer/maxtext/issues/1324