ChatGLM-6B

Error during ptuning evaluation: The expanded size of the tensor (140) must match the existing size (312) at non-singleton dimension 0. Target sizes: [140]. Tensor sizes: [312]

yanqiangmiffy opened this issue 1 year ago · 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

  File "/ptuning/main.py", line 416, in <module>
    main()
  File "/ptuning/main.py", line 354, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 1633, in train
    return inner_training_loop(
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 1979, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 2236, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/ptuning/trainer_seq2seq.py", line 78, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 2932, in evaluate
    output = eval_loop(
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 3113, in evaluation_loop
    loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
  File "/ptuning/trainer_seq2seq.py", line 200, in prediction_step
    generated_tokens = self.model.generate(**gen_kwargs)
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/generation/utils.py", line 1490, in generate
    return self.beam_search(
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/generation/utils.py", line 2836, in beam_search
    sequence_outputs = beam_scorer.finalize(
  File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/generation/beam_search.py", line 377, in finalize
    decoded[i, : sent_lengths[i]] = hypo
RuntimeError: The expanded size of the tensor (140) must match the existing size (312) at non-singleton dimension 0.  Target sizes: [140].  Tensor sizes: [312]
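For context (an editorial note, not part of the original report): my reading of `beam_search.finalize` in transformers 4.27 is that the output buffer `decoded` is allocated with at most `max_length` columns (140 here, from `--val_max_target_length`), while a finished ChatGLM hypothesis includes the prompt tokens and can therefore be much longer (312 here). A minimal sketch of the shape clash, with illustrative values:

  import torch

  max_length = 140   # width of the `decoded` buffer, capped by the max_length given to generate()
  hypo_len = 312     # finished hypothesis length; ChatGLM sequences include the prompt tokens

  decoded = torch.zeros(1, max_length, dtype=torch.long)
  hypo = torch.arange(hypo_len)

  # The column slice clamps to 140 columns, so the assignment sees target [140] vs source [312]:
  decoded[0, :hypo_len] = hypo
  # RuntimeError: The expanded size of the tensor (140) must match the existing size (312)
  # at non-singleton dimension 0.  Target sizes: [140].  Tensor sizes: [312]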

Expected Behavior

File "/home/searchgpt/anaconda3/envs/chatglm_lora/lib/python3.10/site-packages/transformers/trainer.py", line 2236, in _maybe_log_save_evaluate metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)

The error occurs during evaluation: some samples evaluate normally, but evaluation fails on others.

Steps To Reproduce

Run arguments:

  args = [
      '--model_name_or_path=../../pretrained_models/chatglm-6b',
      '--do_train',
      '--do_eval',
      '--do_predict',
      '--overwrite_cache',
      '--overwrite_output_dir',
      '--per_device_train_batch_size=16',
      '--per_device_eval_batch_size=1',
      '--gradient_accumulation_steps=1',
      '--predict_with_generate',
      '--max_steps=3000',
      '--logging_steps=50',
      '--eval_steps=80',
      '--evaluation_strategy=steps',
      '--save_steps=1000',
      '--learning_rate=1e-2',
      '--output_dir=output',
      '--pre_seq_len=4',
      '--quantization_bit=4',
      '--max_source_length=420',
      '--max_target_length=140',
      '--val_max_target_length=140',
      '--num_beams=4',
      '--max_eval_samples=100',
  ]
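A plausible fix direction (an assumption on my part, not confirmed anywhere in this thread): because ChatGLM's generate() returns prompt plus response, the max_length used during evaluation has to cover max_source_length + max_target_length, not the target alone. A hedged sketch, with names only mirroring the gen_kwargs built in ptuning/trainer_seq2seq.py:

  # Illustrative values taken from the run arguments above; the exact wiring
  # into trainer_seq2seq.py is an assumption, not the repo's confirmed fix.
  max_source_length = 420
  max_target_length = 140

  gen_kwargs = {
      # ChatGLM's generate() output includes the prompt tokens, so size the
      # beam-search buffer for prompt + response rather than the response alone:
      "max_length": max_source_length + max_target_length,  # 560 instead of 140
      "num_beams": 4,
  }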

Environment

Runtime environment:

- tensorboard              2.12.0
- tensorboard-data-server  0.7.0
- tensorboard-plugin-wit   1.8.1
- termcolor                2.2.0
- threadpoolctl            3.1.0
- tokenizers               0.12.1
- torch                    2.0.0
- torchvision              0.15.1
- tqdm                     4.65.0
- transformers             4.27.4
- triton                   2.0.0

Anything else?

No response

yanqiangmiffy · Apr 01 '23

Running the official dataset and examples works fine, though; the error only appears after switching to my own dataset. The most obvious difference is that my dataset's samples are generally much longer.

yanqiangmiffy · Apr 01 '23

Don't call model.eval().

cywjava · Apr 15 '23

I'm hitting the same problem. Have you solved it?

jiejue26 · May 05 '23

Same problem here. How can it be solved? (Waiting online.)

  ChatGLM2/modeling_chatglm.py", line 228, in forward
    context_layer = torch.nn.functional.scaled_dot_product_attention(query_layer, key_layer, value_layer,
  RuntimeError: The size of tensor a (47) must match the size of tensor b (15) at non-singleton dimension 3

ThinkThrice · Jul 06 '23
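An aside on this second traceback (my interpretation, with illustrative shapes only): the broadcast failure inside scaled_dot_product_attention typically means the attention mask was built for a different key/value length (15) than the keys actually passed in (47), e.g. stale past_key_values bookkeeping. The same broadcast rule fails in isolation:

  import torch

  scores = torch.randn(1, 2, 47, 47)  # [batch, heads, query_len, kv_len]
  mask = torch.zeros(1, 1, 47, 15)    # additive mask sized for a 15-token KV cache

  scores + mask
  # RuntimeError: The size of tensor a (47) must match the size of tensor b (15)
  # at non-singleton dimension 3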

transformers==4.32.0

Hukongtao · Apr 08 '24