Size mismatch: error(s) when loading a model fine-tuned with LoRA
Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
Is there an existing answer for this in the FAQ?
- [X] I have searched the FAQ
Current Behavior
I fine-tuned Qwen-7B, and when I load the fine-tuned model I get the following error:
root@1fc7d6985d8b:/Fine/Qwen-main# python3 cli_demo.py
/usr/local/lib/python3.8/dist-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:10<00:00, 1.30s/it]
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embeding dimension will be 151851. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Traceback (most recent call last):
File "cli_demo.py", line 217, in <module>
main()
File "cli_demo.py", line 123, in main
model, tokenizer, config = _load_model_tokenizer(args)
File "cli_demo.py", line 60, in _load_model_tokenizer
model = AutoPeftModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.8/dist-packages/peft/auto.py", line 128, in from_pretrained
return cls._target_peft_class.from_pretrained(
File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 353, in from_pretrained
model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 697, in load_adapter
load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
File "/usr/local/lib/python3.8/dist-packages/peft/utils/save_and_load.py", line 249, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.transformer.wte.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 4096]) from checkpoint, the shape in current model is torch.Size([151851, 4096]).
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 4096]) from checkpoint, the shape in current model is torch.Size([151851, 4096]).
I fine-tuned the model with:
bash finetune_lora_single_gpu.sh -d my_train_data.json
and I modified my cli_demo.py following the tutorial:
model = AutoPeftModelForCausalLM.from_pretrained(
model_path, # path to the output directory or model name
device_map=device_map,
trust_remote_code=True,
).eval()
I found some possibly related issues, such as #419 and #482, but they don't solve my problem.
Expected Behavior
The demo runs normally.
Steps To Reproduce
No response
Environment
- OS: Ubuntu 22.04.1
- Python: 3.8
- Transformers: 4.32.0
- PyTorch: 2.2.1
- CUDA: 12.1
Anything else?
No response
Something seems wrong with the vocab_size (which is the size of the embedding, not the actual vocabulary size) in config.json and the pad_to_multiple_of setting.
Please first try upgrading transformers (keeping it below 4.38.0) and downgrading peft to below 0.8.0, and provide the content of your config.json.
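If pinning versions alone doesn't help, one possible workaround is to load the base model yourself, resize its embeddings back to the 151936 rows stored in the LoRA checkpoint, and only then attach the adapter with PeftModel instead of AutoPeftModelForCausalLM. This is a minimal sketch, not an official fix: the base-model name and the adapter directory are assumptions, and it presumes Qwen's remote-code model supports resize_token_embeddings cleanly.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_path = "Qwen/Qwen-7B"  # assumption: the base model used for fine-tuning
adapter_path = "output_qwen"      # assumption: output dir of finetune_lora_single_gpu.sh

model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    device_map="auto",
    trust_remote_code=True,
).eval()

# Resize back to the 151936 rows the LoRA checkpoint was trained with,
# rather than letting the loader shrink the embedding to 151851.
model.resize_token_embeddings(151936)

model = PeftModel.from_pretrained(model, adapter_path)
tokenizer = AutoTokenizer.from_pretrained(adapter_path, trust_remote_code=True)
This sidesteps the automatic resize to 151851 that the warning in the log describes.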
Okay, I upgraded transformers to 4.38.0 and am using peft==0.7.0 now, but I'm hitting some errors that I didn't see with peft==0.9.0.
Here is config.json; I copied it from https://huggingface.co/Qwen/Qwen-7B-Chat/blob/main/config.json:
{
"architectures": [
"QWenLMHeadModel"
],
"auto_map": {
"AutoConfig": "configuration_qwen.QWenConfig",
"AutoModelForCausalLM": "modeling_qwen.QWenLMHeadModel"
},
"attn_dropout_prob": 0.0,
"bf16": false,
"emb_dropout_prob": 0.0,
"fp16": false,
"fp32": false,
"hidden_size": 4096,
"intermediate_size": 22016,
"initializer_range": 0.02,
"kv_channels": 128,
"layer_norm_epsilon": 1e-06,
"max_position_embeddings": 32768,
"model_type": "qwen",
"no_bias": true,
"num_attention_heads": 32,
"num_hidden_layers": 32,
"onnx_safe": null,
"rotary_emb_base": 10000,
"rotary_pct": 1.0,
"scale_attn_weights": true,
"seq_length": 8192,
"tie_word_embeddings": false,
"tokenizer_class": "QWenTokenizer",
"transformers_version": "4.32.0",
"use_cache": true,
"use_dynamic_ntk": true,
"use_flash_attn": "auto",
"use_logn_attn": true,
"vocab_size": 151936
}
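As a side check, the 151851 in the error looks like it comes from the tokenizer length, while 151936 is the vocab_size above. A minimal sketch to confirm, assuming the stock Qwen/Qwen-7B-Chat files:
from transformers import AutoConfig, AutoTokenizer

path = "Qwen/Qwen-7B-Chat"  # or a local checkpoint directory
config = AutoConfig.from_pretrained(path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

print("config.vocab_size:", config.vocab_size)  # 151936 according to the config.json above
print("len(tokenizer):", len(tokenizer))        # presumably 151851, matching the resize warning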
There should be an adapter_config.json as well. Let's see what's there. I think peft is changing the vocab_size somewhere.
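For reference, a small sketch to collect that information: it prints adapter_config.json and the shapes of the saved embedding/lm_head tensors. The file names are PEFT defaults and the directory is a placeholder for your own output path.
import json, os
import torch

adapter_dir = "output_qwen"  # assumption: your LoRA output directory

# Print the adapter configuration requested above.
with open(os.path.join(adapter_dir, "adapter_config.json")) as f:
    print(json.dumps(json.load(f), indent=2))

# Load the saved adapter weights (newer peft saves safetensors, older saves a .bin).
weights_file = os.path.join(adapter_dir, "adapter_model.safetensors")
if os.path.exists(weights_file):
    from safetensors.torch import load_file
    state_dict = load_file(weights_file)
else:
    state_dict = torch.load(os.path.join(adapter_dir, "adapter_model.bin"), map_location="cpu")

# Show the shapes of the embedding and lm_head tensors that trigger the size mismatch.
for name, tensor in state_dict.items():
    if "wte" in name or "lm_head" in name:
        print(name, tuple(tensor.shape))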
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
Hi, have you solved this problem?
I'm sorry I didn't follow up on this issue.
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.