alpaca-lora
There were missing keys in the checkpoint model loaded
I trained the Alpaca-LoRA model with the following params:

base_model: /usr/local/dbbd/model/llama-7b-hf
data_path: alpaca_data.json
output_dir: ./lora-alpaca
batch_size: 128
micro_batch_size: 4
num_epochs: 2
learning_rate: 0.0001
cutoff_len: 512
val_set_size: 2000
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: ['q_proj', 'v_proj']
train_on_inputs: True
group_by_length: True
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: False
prompt template: alpaca
But I get a broken adapter_model.bin that is only 443 bytes, together with output like:

There were missing keys in the checkpoint model loaded: ['base_model.model.model.embed_tokens.weight', 'base_model.model.model.layers.0.self_attn.q_proj.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.k_proj.weight', 'base_model.model.model.layers.0.self_attn.v_proj.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_A.default.weight', 'base_model.model.model.layers.0.self_attn.v_proj.lora_B.default.weight', 'base_model.model.model.layers.0.self_attn.o_proj.weight', 'base_model.model.model.layers.0.self_attn.rotary_emb.inv_freq', 'base_model.model.model.layers.0.mlp.gate_proj.weight', 'base_model.model.model.layers.0.mlp.down_proj.weight', 'base_model.model.model.layers.0.mlp.up_proj.weight', 'base_model.model.model.layers.0.input_layernorm.weight', 'base_model.model.model.layers.0.post_attention_layernorm.weight', ... (the same keys repeat for every other layer)]
Does anyone know why this happens?
UPDATED:
Open <your python path>/site-packages/peft/utils/save_and_load.py (e.g. with vim) and comment out the line "to_return = {k: v for k, v in to_return.items() if (("lora_" in k and adapter_name in k) or ("bias" in k))}".
My latest conclusion is that get_peft_model_state_dict is called twice:

1. finetune.py overrides model.state_dict:
old_state_dict = model.state_dict
model.state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(
        self, old_state_dict()
    )
).__get__(model, type(model))
In this step the ".default" adapter name has already been stripped from the keys of the state_dict.
2. Then finetune.py calls:
model.save_pretrained(output_dir)
The code above calls get_peft_model_state_dict again, and the line "to_return = {k: v for k, v in to_return.items() if (("lora_" in k and adapter_name in k) or ("bias" in k))}" now returns an empty dict {}, because adapter_name no longer appears in any key k.
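To make the double-filter problem concrete, here is a minimal sketch (toy keys, not real weights); the filter is the line quoted above, and the ".default" stripping matches the behaviour described in step 1:

adapter_name = "default"

# keys as they come out of model.state_dict() on the first pass
first_pass = {
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight": 1,
    "base_model.model.model.layers.0.self_attn.q_proj.lora_B.default.weight": 2,
}

# first call: the filter keeps both keys, because "default" is still in them
kept = {k: v for k, v in first_pass.items()
        if ("lora_" in k and adapter_name in k) or ("bias" in k)}
print(len(kept))  # 2

# before returning, get_peft_model_state_dict strips ".default" from the keys,
# so the patched model.state_dict() hands save_pretrained keys like these
second_pass = {k.replace("." + adapter_name, ""): v for k, v in kept.items()}

# second call (inside save_pretrained): the same filter now matches nothing
kept_again = {k: v for k, v in second_pass.items()
              if ("lora_" in k and adapter_name in k) or ("bias" in k)}
print(kept_again)  # {} -> the tiny 443-byte adapter_model.bin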
Why don't we set adapter_name explicitly, to avoid tweaking the library code? The default name in the current peft lib is adapter_name="default". Just passing it at the end of training, in the get_peft_model_state_dict( call, would solve the issue, right? A sketch of the change is below.
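For reference, the change being suggested would look roughly like this in finetune.py (a sketch only; it assumes get_peft_model_state_dict accepts an adapter_name keyword that defaults to "default"):

old_state_dict = model.state_dict
model.state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(
        self, old_state_dict(), adapter_name="default"
    )
).__get__(model, type(model))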
I first tried adding "default" to the get_peft_model_state_dict call, but the output was still empty, for no obvious reason.
Well, I tried the method from the UPDATED comment above, but it doesn't work, and it seems the content of that part of peft has been updated since.
Only comment out line 52, i.e. change it to:

# to_return = {k: v for k, v in to_return.items() if (("lora_" in k and adapter_name in k) or ("bias" in k))}
I just tried it with 10 instances, and it generated a 67 MB adapter_model.bin.
Or you can take the easy way and go back to the old PEFT version:
pip uninstall peft -y
pip install git+https://github.com/huggingface/peft.git@e536616888d51b453ed354a6f1e243fecb02ea08
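To double-check that either workaround took effect, a quick inspection of the saved file is enough (the path below is just the output_dir from the params above; adjust it as needed):

import os
import torch

adapter_path = "./lora-alpaca/adapter_model.bin"
print(os.path.getsize(adapter_path))   # should be tens of MB, not ~443 bytes

state = torch.load(adapter_path, map_location="cpu")
print(len(state))                      # number of LoRA tensors, should be > 0
print(next(iter(state)))               # e.g. a ...q_proj.lora_A.weight key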
Thank you very much, your suggestion was useful; I reinstalled peft. But I have another problem: I fine-tuned on my own dataset with the default parameters and the results are not good. Do you have any suggestions?
Without changing the peft source code, in my case you can solve this by just removing the following lines in finetune.py, which cause adapter_model.bin to be saved twice so that you only end up with the last, empty checkpoint and the missing-key warning:

https://github.com/tloen/alpaca-lora/blob/8bb8579e403dc78e37fe81ffbb253c413007323f/finetune.py#L261-L272
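With those lines removed, the end of finetune.py reduces to something like the sketch below (variable names as in finetune.py; this is an illustration, not the exact file), so PEFT's own get_peft_model_state_dict runs only once, inside save_pretrained:

# train, then let PEFT handle the adapter-only save itself
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
model.save_pretrained(output_dir)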
I tried your approach, and now I'm getting a quantization error:

RuntimeError: Loading a quantized checkpoint into non-quantized Linear8bitLt is not supported. Please call module.cuda() before module.load_state_dict()
I'm using an RTX 4090
old_state_dict = model.state_dict
# build the LoRA-only state dict once from the original state_dict ...
state_dict = (
    lambda self, *_, **__: get_peft_model_state_dict(
        self, old_state_dict()
    )
).__get__(model, type(model))()
# ... and load those adapter weights back into the model before saving
set_peft_model_state_dict(model, state_dict)
@darrenwang00 Hi, do you mean adding a set_peft_model_state_dict(model, state_dict) call at the end?
Hi, did you solve the problem with the set_peft_model_state_dict(model, state_dict) code?
How was the dict saved? There are 4 different ways to save a model:

model.save_pretrained(PATH)
trainer.save_model(PATH)
torch.save({'model_state_dict': model.state_dict()}, PATH)
TrainingArguments(save_strategy='steps')

Which one can I use to store a PeftModelForCausalLM wrapping an AutoModelForCausalLM, and how do I load it again?

Using torch.save({'model_state_dict': model.state_dict()}) leads to a bunch of _IncompatibleKeys(missing_keys=['base_model.model.transformer.wte.weight', ...
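For what it's worth, the usual round trip for a LoRA adapter only needs save_pretrained plus PeftModel.from_pretrained; saving the raw model.state_dict() via torch.save keeps the base_model.model... prefixes, which is why load_state_dict then reports _IncompatibleKeys. A sketch with placeholder paths:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# "model" here is the trained PeftModelForCausalLM from the question.
# save: writes only adapter_config.json and adapter_model.bin
model.save_pretrained("my-lora-adapter")

# load: rebuild the base model first, then attach the adapter on top of it
base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
model = PeftModel.from_pretrained(base, "my-lora-adapter")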