
p_tuning mode raises RuntimeError: expected scalar type Half but found Float

Open · hezhefly opened this issue 1 year ago · 0 comments

The error output is as follows:

  0%|                                                                                               | 0/14400 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/jovyan/fast-data/ChatGLM-Efficient-Tuning-main/src/train_sft.py", line 109, in <module>
    main()
  File "/home/jovyan/fast-data/ChatGLM-Efficient-Tuning-main/src/train_sft.py", line 77, in main
    train_result = trainer.train()
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/transformers/trainer.py", line 1645, in train
    return inner_training_loop(
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/transformers/trainer.py", line 1938, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/transformers/trainer.py", line 2759, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/transformers/trainer.py", line 2784, in compute_loss
    outputs = model(**inputs)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm/modeling_chatglm.py", line 1190, in forward
    transformer_outputs = self.transformer(
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm/modeling_chatglm.py", line 985, in forward
    layer_ret = torch.utils.checkpoint.checkpoint(
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/.cache/huggingface/modules/transformers_modules/chatglm/modeling_chatglm.py", line 624, in forward
    attention_input = self.input_layernorm(hidden_states)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/home/jovyan/miniconda3/envs/chatglm/lib/python3.10/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float
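
The failure is a dtype mismatch inside torch.layer_norm: the hidden states and the LayerNorm parameters reach the kernel in different floating-point types (one Half, one Float) once the p-tuning parameters are in play. A minimal, hypothetical reduction of the failing call (not code from the repo; whether it actually raises depends on the torch build and on which side ends up in fp16):

import torch
import torch.nn.functional as F

# Hypothetical stand-in for self.input_layernorm(hidden_states) in modeling_chatglm.py:
# fp16 activations meeting fp32 LayerNorm parameters.
x = torch.randn(2, 8, device="cuda", dtype=torch.float16)
weight = torch.ones(8, device="cuda", dtype=torch.float32)
bias = torch.zeros(8, device="cuda", dtype=torch.float32)

try:
    out = F.layer_norm(x, (8,), weight, bias)
    print("no error on this torch build, output dtype:", out.dtype)
except RuntimeError as e:
    print(e)  # on the reporter's setup: expected scalar type Half but found Float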

Following the workaround in https://github.com/tloen/alpaca-lora/issues/203, add the following around line 75 of src/train_sft.py:

# add "import torch" to the imports at the top of the file
# Training
if training_args.do_train:
    with torch.autocast("cuda"):
        train_result = trainer.train()  # existing call from line 77, now indented under autocast
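
As far as I can tell, this works because layer_norm is on the CUDA autocast float32 list: inside the autocast region the op is run in fp32 and its floating-point inputs are cast to a common dtype, so the Half/Float collision cannot occur. A small sketch of that behaviour (assuming fp32 LayerNorm parameters and fp16 hidden states, which is one plausible reading of the error message):

import torch

# Under autocast, layer_norm executes in float32 regardless of the input dtype,
# so mismatched parameter/activation dtypes no longer collide.
ln = torch.nn.LayerNorm(8).cuda()                          # fp32 parameters
x = torch.randn(2, 8, device="cuda", dtype=torch.float16)  # fp16 hidden states

with torch.autocast("cuda"):
    out = ln(x)

print(out.dtype)  # torch.float32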

Bug fix

hezhefly · Jul 02 '23 09:07