Running merge_lora fails with `RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: copying a param with shape torch.Size([3072, 8, 1]) from checkpoint, the shape in current model is torch.Size([4608, 8])`.
Why does 4608 appear here?

Full log:
```
Generation config file not found, using a generation config created from the model config.
07/07/2023 16:36:35 - INFO - utils.common - Fine-tuning method: LoRA
Traceback (most recent call last):
File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 105, in
main()
File "……/ChatGLM-Efficient-Tuning/src/train_sft.py", line 25, in main
model, tokenizer = load_pretrained(model_args, finetuning_args, training_args.do_train, stage="sft")
File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 244, in load_pretrained
model = init_adapter(model, model_args, finetuning_args, is_trainable)
File "……/ChatGLM-Efficient-Tuning/src/utils/common.py", line 117, in init_adapter
model = PeftModel.from_pretrained(model, checkpoint)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 181, in from_pretrained
model.load_adapter(model_id, adapter_name, **kwargs)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/peft_model.py", line 376, in load_adapter
set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 123, in set_peft_model_state_dict
model.load_state_dict(peft_model_state_dict, strict=False)
File "……/miniconda3/envs/glm_tuning/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.0.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.1.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.2.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.3.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.4.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.5.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.6.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.7.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.8.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.9.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.10.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.11.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.12.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.13.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.14.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.15.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.16.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.17.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.18.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.19.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.20.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.21.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.22.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.23.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.24.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.25.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.26.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_A.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([64, 4096]).
size mismatch for base_model.model.transformer.encoder.layers.27.self_attention.query_key_value.lora_B.default.weight: copying a param with shape torch.Size([0]) from checkpoint, the shape in current model is torch.Size([4608, 64]).
```
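For context, 4608 appears to be the output width of ChatGLM2-6B's fused query_key_value projection (4096 for the query heads plus 2 × 256 for the multi-query key/value heads), while the shapes stored in the checkpoint ([3072, 8, 1] above, rank 8) look like they come from a different base model and a different LoRA rank than the model being loaded (rank 64, judging from the [64, 4096] lora_A shape). One way to confirm this is to print the tensor shapes saved in the adapter checkpoint and compare them with what the current base model expects. Below is a minimal diagnostic sketch, assuming the adapter weights are stored as adapter_model.bin inside the checkpoint directory; both paths are placeholders:

```python
# Diagnostic sketch (not part of the repo): compare the tensor shapes stored in the
# LoRA checkpoint with the shapes the currently loaded base model expects.
# Both paths below are placeholders -- replace them with your own locations.
import torch
from transformers import AutoModel

checkpoint_dir = "path/to/lora/checkpoint"      # hypothetical: dir containing adapter_model.bin
base_model_path = "path/to/chatglm/base/model"  # hypothetical: base model used for merging

# 1. What the LoRA checkpoint actually contains
adapter_weights = torch.load(f"{checkpoint_dir}/adapter_model.bin", map_location="cpu")
for name, tensor in adapter_weights.items():
    print(name, tuple(tensor.shape))

# 2. What the current base model expects for the LoRA target module
model = AutoModel.from_pretrained(base_model_path, trust_remote_code=True)
for name, module in model.named_modules():
    if name.endswith("query_key_value"):
        print(name, "weight:", tuple(module.weight.shape))
        break  # every layer has the same shape, so one is enough
```

If the two sets of shapes disagree, the adapter was trained against a different base model (or with different LoRA hyperparameters) than the one being loaded here; retraining the LoRA on the intended base model, or pointing the base model path at the model that was actually used for training, should make the shapes line up.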