minimal-llama
How to correctly load and merge finetuned LLaMA models in different formats?
I am new to NLP and currently exploring the LLaMA model. I understand that there are different formats for this model: the original format and the Hugging Face format. I have fine-tuned LLaMA on my dataset using https://github.com/lxe/llama-peft-tuner, which is based on minimal-llama, and it saves the models as shown below:
$ ll llama-peft-tuner/models/csco-llama-7b-peft/
total 16456
drwxrwxr-x 8 lachlan lachlan 4096 May 10 10:42 ./
drwxrwxr-x 5 lachlan lachlan 4096 May 10 10:06 ../
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:21 checkpoint-1000/
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:28 checkpoint-1500/
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:35 checkpoint-2000/
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:42 checkpoint-2500/
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:13 checkpoint-500/
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:42 model-final/
-rw-rw-r-- 1 lachlan lachlan 16814911 May 10 10:42 params.p
$ ll llama-peft-tuner/models/csco-llama-7b-peft/checkpoint-2500/
total 7178936
drwxrwxr-x 2 lachlan lachlan 4096 May 10 10:42 ./
drwxrwxr-x 8 lachlan lachlan 4096 May 10 10:42 ../
-rw-rw-r-- 1 lachlan lachlan 33629893 May 10 10:42 optimizer.pt
-rw-rw-r-- 1 lachlan lachlan 7317523229 May 10 10:42 pytorch_model.bin
-rw-rw-r-- 1 lachlan lachlan 14575 May 10 10:42 rng_state.pth
-rw-rw-r-- 1 lachlan lachlan 557 May 10 10:42 scaler.pt
-rw-rw-r-- 1 lachlan lachlan 627 May 10 10:42 scheduler.pt
-rw-rw-r-- 1 lachlan lachlan 28855 May 10 10:42 trainer_state.json
-rw-rw-r-- 1 lachlan lachlan 3899 May 10 10:42 training_args.bin
I am not quite sure about the relationship between pytorch_model.bin, the original model, and adapter_model.bin. I suppose pytorch_model.bin is in the Hugging Face format. Now, I want to create a .pth model that I can load in https://github.com/juncongmoo/pyllama/tree/main/apps/gradio.
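For reference, this is roughly how I have been inspecting the checkpoint to figure out what pytorch_model.bin actually holds (the path comes from the listing above; this is just a sanity-check snippet of mine, not part of any of the tools mentioned):

import torch

# Load the checkpoint state dict on CPU purely to inspect its keys
ckpt_path = "llama-peft-tuner/models/csco-llama-7b-peft/checkpoint-2500/pytorch_model.bin"
state_dict = torch.load(ckpt_path, map_location="cpu")

# A LoRA/PEFT-style save usually has keys containing "lora_A"/"lora_B",
# while a plain Hugging Face LLaMA checkpoint has keys like
# "model.layers.0.self_attn.q_proj.weight"
for key in list(state_dict.keys())[:20]:
    print(key, tuple(state_dict[key].shape))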
I followed the manual conversion guide at https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/Manual-Conversion, which produces a merged model in either Hugging Face format (.bin) or PyTorch format (.pth). I tried treating pytorch_model.bin as the Hugging Face format and modified the code to skip the LoRA step, but I couldn't achieve the desired result. The fine-tuning repository mentioned above provides a way to load the trained model by combining the original model with the learned parameters. I tried to adapt this approach into https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py and tried different combinations, but the result either doesn't incorporate the trained parameters or generates meaningless output.
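For concreteness, my understanding of the standard PEFT merge path is roughly the sketch below. The base model name and paths are placeholders for my setup, and it assumes the adapter is saved in the usual PEFT layout (adapter_config.json + adapter_model.bin), which my checkpoints don't seem to follow, so part of my question is how this maps onto the files listed above:

import torch
from transformers import LlamaForCausalLM
from peft import PeftModel

# Placeholder paths: base LLaMA in Hugging Face format and an adapter directory
base_model_path = "path/to/llama-7b-hf"
adapter_path = "llama-peft-tuner/models/csco-llama-7b-peft/checkpoint-2500"

base_model = LlamaForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, low_cpu_mem_usage=True
)

# Attach the LoRA adapter and fold its weights back into the base model
model = PeftModel.from_pretrained(base_model, adapter_path)
merged = model.merge_and_unload()

# Save the merged model in Hugging Face format (config + .bin shards);
# going from here to the original .pth layout is what the merge script above does
merged.save_pretrained("csco-llama-7b-merged")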
Can someone help me understand how to correctly load and merge these models? Any help would be greatly appreciated. Thank you.