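ChatGLM-6B [BUG/Help] Following the official Ptuning documentation, I fine-tuned the model and loaded both the original and the fine-tuned model, but got a RuntimeError mentioning BFloat16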

Open grep-w opened this issue 2 years ago • 3 comments

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

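Hello. After fine-tuning the model by following the official documentation, I get the following error once the model is loaded: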

RuntimeError: mixed dtype (CPU): expect input to have scalar type of BFloat16
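The message suggests that tensors of different dtypes met in a CPU operation. As a rough diagnostic sketch (not part of the original report), the helper below counts parameters per (device, dtype) pair. `summarize_dtypes` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for real parameters so that the snippet runs without loading the model.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_dtypes(named_params):
    """Count parameters per (device, dtype) pair.

    `named_params` is any iterable of (name, param) pairs where `param`
    has `.device` and `.dtype` attributes, such as the pairs yielded by
    `model.named_parameters()` in PyTorch.
    """
    counts = Counter()
    for _name, param in named_params:
        counts[(str(param.device), str(param.dtype))] += 1
    return dict(counts)


# Stand-in parameters (hypothetical names and dtypes, for illustration only).
fake_params = [
    ("layer.weight", SimpleNamespace(device="cpu", dtype="torch.float16")),
    ("layer.bias", SimpleNamespace(device="cpu", dtype="torch.float16")),
    ("prefix.weight", SimpleNamespace(device="cpu", dtype="torch.float32")),
]

summary = summarize_dtypes(fake_params)
print(summary)
```

With a real model, the call would be `summarize_dtypes(model.named_parameters())`. More than one entry in the result means the model mixes dtypes or devices, which is worth checking before running inference.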

Expected Behavior

No response

Steps To Reproduce

import os

import torch
from transformers import AutoConfig, AutoModel

# CHECKPOINT_PATH is the directory of the P-tuning checkpoint (not shown in the report).
config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True)
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
# Keep only the prefix encoder weights and strip the module prefix from their keys.
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
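The key-renaming step in the snippet above can be tried in isolation. This is a minimal sketch: `extract_prefix_encoder_state` is a hypothetical helper name, and the checkpoint is a plain dict with strings standing in for tensors, so it runs without PyTorch or the model.

```python
PREFIX = "transformer.prefix_encoder."


def extract_prefix_encoder_state(state_dict, prefix=PREFIX):
    """Return only the entries whose key starts with `prefix`, with the prefix removed."""
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}


# Stand-in checkpoint: key names and values are illustrative, not real tensors.
checkpoint = {
    "transformer.prefix_encoder.embedding.weight": "prefix-weights",
    "transformer.word_embeddings.weight": "other-weights",
}

stripped = extract_prefix_encoder_state(checkpoint)
print(stripped)  # {'embedding.weight': 'prefix-weights'}
```

The `startswith` filter matters: without it, any key outside the prefix encoder would have its first characters cut off and then be rejected by `load_state_dict`.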

Environment

- OS:centos7
- Python:3.9
- Transformers:
- PyTorch:
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :

Anything else?

No response

Apr 14 '23 07:04 grep-w

Did you fine-tune on a CPU?

Apr 14 '23 09:04 duzx16

I fine-tuned on a 16 GB V100, but without quantization.

Apr 15 '23 05:04 grep-w

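I ran into the same error. I trained with the minimum-VRAM parameters from https://github.com/THUDM/ChatGLM-6B/tree/main/ptuning. Both approaches described under "Model Deployment" fail with the same error.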

Apr 17 '23 09:04 fireice009

I later tried CPU inference, and that worked. The code is as follows:

# GPU variant, disabled:
# model = AutoModel.from_pretrained("model", config=config, trust_remote_code=True).half().cuda()
# CPU variant: cast the whole model to float32.
model = AutoModel.from_pretrained("model", config=config, trust_remote_code=True).float()
prefix_state_dict = torch.load(os.path.join("output", "checkpoint-20000", "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

Apr 18 '23 00:04 grep-w

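CPU inference is much slower, though. With parameters trained on 100,000 samples and a 10-core Xeon 8255C fully loaded, each sample takes about 20 seconds on average.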

Apr 19 '23 12:04 fireice009

Adding model.half().cuda() before model.eval() fixes it, although some of the inference output comes out garbled.

Apr 22 '23 02:04 Mylszd

Append .half().cuda() to this line: model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True).half().cuda()
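The two workarounds reported in this thread (`.float()` for CPU inference, `.half().cuda()` for GPU inference) can be folded into one helper. This is only a sketch: `place_for_inference` is a hypothetical name, and `FakeModel` is a stand-in that records calls so the snippet runs without PyTorch. Note that one commenter saw some garbled output on the GPU path, so the result should still be checked.

```python
def place_for_inference(model, cuda_available):
    """Cast and move `model` according to the workarounds from the thread.

    GPU: float16 weights on CUDA. CPU: float32 weights, avoiding mixed dtypes.
    """
    if cuda_available:
        return model.half().cuda()
    return model.float()


class FakeModel:
    """Stand-in exposing the same chaining methods as a torch.nn.Module."""

    def __init__(self):
        self.calls = []

    def half(self):
        self.calls.append("half")
        return self

    def cuda(self):
        self.calls.append("cuda")
        return self

    def float(self):
        self.calls.append("float")
        return self


gpu_model = place_for_inference(FakeModel(), cuda_available=True)
cpu_model = place_for_inference(FakeModel(), cuda_available=False)
print(gpu_model.calls, cpu_model.calls)
```

With a real model, the second argument would be `torch.cuda.is_available()`, and the helper would be applied after loading the prefix encoder weights and before `model.eval()`.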

May 09 '23 06:05 Aleluya009
