gemma.cpp
Generate compressed weights file from finetune
How do I generate the compressed weights file (sbs) from my fine-tune? Say I want to convert the model assets at https://huggingface.co/google/gemma-2b-it/tree/main to the compressed weights file; how would I do that?
Thanks!
Hi @sanjay920, really cool that you're trying a fine-tune already. We're working on releasing a conversion script soon (hopefully within the next few days), but it would be useful to know which source formats to prioritize. What are you converting from?
Also, if others need a converter for a fine-tune, feel free to chime in here as well with what you'd use as a source format.
Ideally from a PeftModel, so I can convert the way llama.cpp allows: https://github.com/ggerganov/llama.cpp/blob/master/convert-lora-to-ggml.py
Alternatively, a converter from a merged model (the LoRA adapter merged into the base model, i.e. a GemmaModel) to sbs.
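For context on the "merge the adapter into the base model" route: in PEFT this is what `merge_and_unload()` does, and the underlying arithmetic is just adding the low-rank update into the base weight. A minimal sketch of that update with toy NumPy tensors (shapes and the `merge_lora` helper are illustrative, not part of any library):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a LoRA update into a base weight: W' = W + (alpha / r) * (B @ A)."""
    return W + (alpha / r) * (B @ A)

# Toy shapes: a 4x4 base weight with a rank-2 adapter.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
A = rng.standard_normal((2, 4))  # LoRA down-projection
B = rng.standard_normal((4, 2))  # LoRA up-projection

W_merged = merge_lora(W, A, B, alpha=16, r=2)
```

After this merge the adapter is gone and the result is an ordinary dense checkpoint, which is why a merged-model-to-sbs converter covers the LoRA case too.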
Hi @sanjay920, a quick FYI on the implementation: Compressor in compression/compress-inl.h takes care of writing the SBS file, so that part is covered. The missing piece is getting your model into our CompressedArray<>, which is the part Austin was asking about.
I would like to convert a fine-tuned keras model to sbs, using the fine-tuning script from https://ai.google.dev/gemma/docs/lora_tuning
Hi @fengwang, there is a way to export the Keras weights to PyTorch through this script (it may need a small modification to remove xla if you don't want to use it), and then to convert the PyTorch weights to uncompressed gemma.cpp weights through util/convert_weights.py.
Currently this requires the dev branch because of the issues mentioned in #103; they were fixed in #114 and merged into the dev branch today.
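At its core, the Keras-to-PyTorch step is renaming parameters and saving them as a state_dict that the converter can read. A minimal sketch with toy tensors (the layer names and the name mapping here are illustrative placeholders, not Gemma's actual parameter names):

```python
import numpy as np
import torch

# Stand-in for exported Keras weights: a dict of numpy arrays keyed by layer name.
keras_weights = {
    "decoder_block_0/attention/query/kernel": np.ones((4, 4), dtype=np.float32),
    "final_norm/scale": np.ones((4,), dtype=np.float32),
}

# Hypothetical mapping from Keras layer names to PyTorch state_dict keys.
name_map = {
    "decoder_block_0/attention/query/kernel": "model.layers.0.self_attn.q_proj.weight",
    "final_norm/scale": "model.norm.weight",
}

state_dict = {name_map[k]: torch.from_numpy(v) for k, v in keras_weights.items()}

# This .pt file would then be the input to util/convert_weights.py.
torch.save(state_dict, "gemma_finetuned.pt")
```

The real export script additionally has to transpose or reshape some kernels to match PyTorch conventions; the sketch only shows the renaming and serialization shape of the pipeline.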
I think this is now working, please feel free to reopen if you'd like to discuss or have an issue with the scripts.