Running merge_lora fails with `RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: copying a param with shape torch.Size([3072, 8, 1]) from checkpoint, the shape in current model is torch.Size([4608, 8])`.
Why does 4608 appear here?

Full log:
```
Generation config file not found, using a generation config created from the model config.
07/07/2023 16:36:35 - INFO - utils.common - Fine-tuning method: LoRA
Traceback (most recent call last):
File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 105, in
main()
File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 25, in main
model, tokenizer = load_pretrained(model_args, finetuning_args, training_args.do_train, stage="sft")
File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 244, in load_pretrained
model = init_adapter(model, model_args, finetuning_args, is_trainable)
File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 117, in init_adapter
model = PeftModel.from_pretrained(model, checkpoint)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 181, in from_pretrained
model.load_adapter(model_id, adapter_name, **kwargs)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 376, in load_adapter
set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 123, in set_peft_model_state_dict
model.load_state_dict(peft_model_state_dict, strict=False)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
```
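For context, 4608 appears to be the output width of ChatGLM2-6B's fused query_key_value projection (4096 for the query heads plus 2 × 256 for the multi-query key/value heads), while the shapes stored in the checkpoint ([3072, 8, 1] above, rank 8) look like they come from a different base model and a different LoRA rank than the model being loaded (rank 64, judging from the [64, 4096] lora_A shape). One way to confirm this is to print the tensor shapes saved in the adapter checkpoint and compare them with what the current base model expects. Below is a minimal diagnostic sketch, assuming the adapter weights are stored as adapter_model.bin inside the checkpoint directory; both paths are placeholders:

```python
# Diagnostic sketch (not part of the repo): compare the tensor shapes stored in the
# LoRA checkpoint with the shapes the currently loaded base model expects.
# Both paths below are placeholders -- replace them with your own locations.
import torch
from transformers import AutoModel

checkpoint_dir = "path/to/lora/checkpoint"      # hypothetical: dir containing adapter_model.bin
base_model_path = "path/to/chatglm/base/model"  # hypothetical: base model used for merging

# 1. What the LoRA checkpoint actually contains
adapter_weights = torch.load(f"{checkpoint_dir}/adapter_model.bin", map_location="cpu")
for name, tensor in adapter_weights.items():
    print(name, tuple(tensor.shape))

# 2. What the current base model expects for the LoRA target module
model = AutoModel.from_pretrained(base_model_path, trust_remote_code=True)
for name, module in model.named_modules():
    if name.endswith("query_key_value"):
        print(name, "weight:", tuple(module.weight.shape))
        break  # every layer has the same shape, so one is enough
```

If the two sets of shapes disagree, the adapter was trained against a different base model (or with different LoRA hyperparameters) than the one being loaded here; retraining the LoRA on the intended base model, or pointing the base model path at the model that was actually used for training, should make the shapes line up.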