
[Question] How to merge an intermediate checkpoint with LoRA

terminator123 opened this issue · 6 comments

Question

I want to test `checkpoint-5000` from my LoRA run. When I ran `python scripts/merge_lora_weights.py --model-path ./checkpoints/llava-v1.5-13b-lora --model-base lmsys/vicuna-13b-v1.5 --save-model-path ./checkpoints/merge`, it failed.

terminator123 · Dec 15 '23

You need to copy `config.json` and `non_lora_trainables.bin` into your `checkpoint-5000` folder.
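A minimal sketch of that step, assuming the layout from the command in the question (a run directory `./checkpoints/llava-v1.5-13b-lora` that contains the `checkpoint-5000` subfolder); the paths are placeholders, adjust them to your run:

```python
import shutil

# Paths are assumptions taken from the command in the question; adjust to your run.
run_dir = "./checkpoints/llava-v1.5-13b-lora"
ckpt_dir = f"{run_dir}/checkpoint-5000"

# The intermediate checkpoint folder only holds the LoRA adapter weights, so copy
# the model config and the non-LoRA trainables (the mm projector) alongside them.
for fname in ("config.json", "non_lora_trainables.bin"):
    shutil.copy(f"{run_dir}/{fname}", f"{ckpt_dir}/{fname}")
```

After that, point `--model-path` at the checkpoint folder, e.g. `python scripts/merge_lora_weights.py --model-path ./checkpoints/llava-v1.5-13b-lora/checkpoint-5000 --model-base lmsys/vicuna-13b-v1.5 --save-model-path ./checkpoints/merge`.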

Isaachhh · Dec 29 '23

I also have the same problem #1194. Did you solve it?

charismaticchiu · Feb 28 '24

> You need to copy `config.json` and `non_lora_trainables.bin` into your `checkpoint-5000` folder.

Are `config.json` and `non_lora_trainables.bin` saved only at the end of the entire training? I have set 10 epochs; can I copy these two files from the epoch-10 output directly into the first nine checkpoints?

wuwu-C · Apr 20 '24

> Are `config.json` and `non_lora_trainables.bin` saved only at the end of the entire training?

I think so.

> I have set 10 epochs; can I copy these two files from the epoch-10 output directly into the first nine checkpoints?

The projector weights are saved in `non_lora_trainables.bin`, and the projector is unfrozen (trained) during the SFT stage, so the epoch-10 file would not match the projector as it was at the earlier checkpoints.

Isaachhh · Apr 20 '24

Thank you for your reply! But I still have some questions.

> The projector weights are saved in `non_lora_trainables.bin`, and the projector is unfrozen (trained) during the SFT stage.

  1. Doesn't `non_lora_trainables.bin` store the weights that are not part of the LoRA-tuned portion? Shouldn't those weights be frozen? Why does it store the projector weights?
  2. In your previous answer you said to copy the two files into the corresponding checkpoint folder. If the projector is unfrozen during the SFT stage, that approach is incorrect. How can I merge an intermediate checkpoint with LoRA? Could you give a more detailed explanation? Thank you!

wuwu-C · Apr 21 '24

> 1. Doesn't `non_lora_trainables.bin` store the weights that are not part of the LoRA-tuned portion? Shouldn't those weights be frozen? Why does it store the projector weights?
> 2. If the projector is unfrozen during the SFT stage, copying the two files is incorrect. How can I merge an intermediate checkpoint with LoRA?
  1. The name is `non_lora_trainables`: non-LoRA *and* trainable. It stores the projector because the projector is trained directly rather than through LoRA. You can check it yourself:

```python
import torch

# Inspect what non_lora_trainables.bin actually contains; the keys should be the mm projector weights.
a = torch.load('.../non_lora_trainables.bin')
print(a.keys())
```

  2. Yes, you are right. You may need to edit the source code so that the projector weights are also saved at intermediate checkpoints.
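A rough sketch of one way to do that without touching LLaVA's training loop itself: a `transformers` `TrainerCallback` that dumps every trainable non-LoRA parameter into the checkpoint folder on each save. The callback name and the `"lora_" not in name` filter are my assumptions, not code from the LLaVA repo:

```python
import os

import torch
from transformers import TrainerCallback


class SaveNonLoraTrainablesCallback(TrainerCallback):
    """On every checkpoint save, also write the trainable non-LoRA weights
    (e.g. the mm projector) to non_lora_trainables.bin in that checkpoint folder."""

    def on_save(self, args, state, control, **kwargs):
        model = kwargs["model"]
        # Keep parameters that are trained directly (requires_grad) but are not LoRA adapters.
        non_lora_trainables = {
            name: param.detach().cpu()
            for name, param in model.named_parameters()
            if param.requires_grad and "lora_" not in name
        }
        ckpt_dir = os.path.join(args.output_dir, f"checkpoint-{state.global_step}")
        os.makedirs(ckpt_dir, exist_ok=True)
        torch.save(non_lora_trainables,
                   os.path.join(ckpt_dir, "non_lora_trainables.bin"))
        return control
```

Register it with `trainer.add_callback(SaveNonLoraTrainablesCallback())` before calling `trainer.train()`. Note that this naive version does not gather parameters sharded by DeepSpeed ZeRO-3; if you train with ZeRO-3 you would need to gather them first, similar to what LLaVA's end-of-training save does.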

Isaachhh · Apr 22 '24