Converting a Gemma MaxText-compatible checkpoint to Hugging Face format

Open salrowili opened this issue 1 year ago • 5 comments

I have looked around for a script that could convert MaxText Gemma and Gemma 2 checkpoints to Hugging Face format, but I have not found anything related. This may be related to https://github.com/google/maxtext/pull/581

salrowili avatar Aug 16 '24 12:08 salrowili

Any update on this?

salrowili avatar Aug 21 '24 12:08 salrowili

Indeed, https://github.com/google/maxtext/pull/581 was adding support for this. Out of curiosity, what is your use case?

gobbleturk avatar Sep 17 '24 18:09 gobbleturk

Hi @gobbleturk, https://github.com/google/maxtext/pull/581 does not work with Gemma, because Gemma 2 has both local and global attention. I think each of the q, k, and v attention layers has a local layer followed by a global one. My use case: I did continual pre-training of the Gemma 2 2B model on a monolingual pre-training dataset, and I want to use the HF SFT trainer for supervised fine-tuning.
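
To illustrate the local/global pairing described above, here is a minimal sketch of how paired attention blocks could be flattened into a single list of decoder layers. It assumes the local layer comes first in each pair, as described in this comment. The dictionary keys `"local"` and `"global"` and both function names are hypothetical placeholders, not MaxText or Hugging Face parameter names, and a real conversion would also need to rename and reshape the weights.

```python
# Hypothetical sketch: flatten paired (local, global) blocks into a flat
# list of decoder layers. Key names and function names are illustrative
# assumptions, not actual MaxText or Hugging Face identifiers.

def flat_layer_index(block_index: int, kind: str) -> int:
    """Map a (block, 'local' | 'global') pair to a flat layer index.

    Assumes the local layer precedes the global layer within each block.
    """
    if kind not in ("local", "global"):
        raise ValueError(f"unknown layer kind: {kind!r}")
    return 2 * block_index + (0 if kind == "local" else 1)


def unpair_layers(paired_blocks):
    """Return the layers in flat order: local, global, local, global, ..."""
    flat = []
    for block in paired_blocks:
        flat.append(block["local"])
        flat.append(block["global"])
    return flat


if __name__ == "__main__":
    # Strings stand in for the per-layer weight trees.
    blocks = [
        {"local": "block0_local", "global": "block0_global"},
        {"local": "block1_local", "global": "block1_global"},
    ]
    for i, layer in enumerate(unpair_layers(blocks)):
        print(i, layer)
```

A converter that assumes one attention layer per block would skip this step, which is one plausible reason a script written for non-paired models would not carry over directly.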

Thank you though for taking care of this.

salrowili avatar Sep 18 '24 04:09 salrowili

@gobbleturk, any update on this? Adding this feature would support further research on the Gemma model, especially academic research on low-resource languages.

salrowili avatar Oct 12 '24 09:10 salrowili

You should find a conversion script for Gemma 2 here: https://github.com/AI-Hypercomputer/maxtext/issues/1324

R4ZZ3 avatar May 08 '25 13:05 R4ZZ3