[Bug]: Llama Format model produces incorrect output
Is there an existing issue?
- [X] I have searched, and there is no existing issue.
Describe the bug
I am running inference with the Llama Format model. With the prompt left at the default example (prompt="Now you act like a terminal situated within a beginner's C++ practice repository folder, please provide the output for the command: ls -l"),
the answer does not match expectations and even contains some garbled text. Specific output:
To Reproduce
Download "openbmb/MiniCPM-2B-dpo-bf16-llama-format" from Hugging Face, then run the MiniCPM-2B (Llama Format) script:

```python
import torch
from transformers import LlamaTokenizerFast, LlamaForCausalLM

model_path = "openbmb/MiniCPM-2B-dpo-bf16-llama-format"
tokenizer = LlamaTokenizerFast.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

prompt = "Now you act like a terminal situated within a beginner's C++ practice repository folder, please provide the output for the command: ls -l"
input_ids = tokenizer.encode("<用户>{}<AI>".format(prompt), return_tensors='pt', add_special_tokens=True).cuda()
responds = model.generate(input_ids, temperature=0.3, top_p=0.8, repetition_penalty=1.02, max_length=1024)
responds = tokenizer.decode(responds[0], skip_special_tokens=True)
print(responds)
```
The output is bad.
Expected behavior
The prompt should be answered correctly.
Screenshots
No response
Environment
- OS: ubuntu 20.04
- torch: 1.13.1+cu116
- torchvision: 0.14.1+cu116
- tokenizers: 0.15.2
- transformers: 4.36.0
- Device: A100
Additional context
Thanks!
I used the same example code and ran into the same problem. When the model loads the pretrained weights, I get the following warning:
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at openbmb/MiniCPM-2B-dpo-bf16-llama-format and are newly initialized: ['lm_head.weight']
So I think `lm_head.weight` not being loaded correctly is what leads to the bad results.
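A quick way to confirm this diagnosis, as a minimal check assuming the model has already been loaded with the script above:

```python
import torch

# If the weights were tied correctly, lm_head and the input embedding
# should share storage (or at least hold equal values).
shares_storage = model.lm_head.weight.data_ptr() == model.model.embed_tokens.weight.data_ptr()
values_equal = torch.equal(model.lm_head.weight, model.model.embed_tokens.weight)
print("shares storage:", shares_storage)  # False here: lm_head was re-initialized
print("values equal:  ", values_equal)    # also False for a randomly initialized lm_head
```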
This problem may be caused by tie_weights for the embedding and lm_head. I guess MiniCPM reuses embedding.weight as lm_head.weight, but LlamaForCausalLM does not do that automatically.
The simplest way to fix this is running `model.lm_head.weight = model.model.embed_tokens.weight` (or `model.lm_head.weight = torch.nn.Parameter(model.model.embed_tokens.weight.clone())`) after model initialization.
Here is a blog that explains tie_weights in transformers. You may find it helpful.
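A minimal sketch of applying that fix, assuming the same loading code as in the report above:

```python
import torch
from transformers import LlamaTokenizerFast, LlamaForCausalLM

model_path = "openbmb/MiniCPM-2B-dpo-bf16-llama-format"
tokenizer = LlamaTokenizerFast.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)

# Re-tie the output head to the input embeddings: the checkpoint does not
# ship a separate lm_head.weight, so LlamaForCausalLM initializes it randomly.
model.lm_head.weight = model.model.embed_tokens.weight
# Alternatively, copy instead of sharing storage:
# model.lm_head.weight = torch.nn.Parameter(model.model.embed_tokens.weight.clone())
```

With the head tied back to the embeddings, generation should follow the prompt instead of producing garbled output.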
It works for me! Thanks.