
[BUG] predict results are empty when running evaluate

Open micrazy opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

Contents of evaluate.sh:

    PRE_SEQ_LEN=128
    CHECKPOINT=viewgen0421-chatglm-6b-pt-128-2e-2
    STEP=5000

    CUDA_VISIBLE_DEVICES=1 python3 main.py \
        --do_predict \
        --validation_file /home/workspace/data/dev.json \
        --test_file /home/workspace/data/dev.json \
        --overwrite_cache \
        --prompt_column content \
        --response_column summary \
        --model_name_or_path /home/workspace/chatglm/chatglm-6B \
        --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
        --output_dir ./output/$CHECKPOINT \
        --overwrite_output_dir \
        --max_source_length 512 \
        --max_target_length 512 \
        --per_device_eval_batch_size 1 \
        --predict_with_generate \
        --pre_seq_len $PRE_SEQ_LEN \
        --quantization_bit 4

The log prints a warning: Input length of input_ids is 512, but max_length is set to 512. This can lead to unexpected behavior. You should consider increasing max_new_tokens.

This causes rouge.get_scores to raise ValueError: Hypothesis is empty. https://github.com/THUDM/ChatGLM-6B/blob/aeced3619b804d20d2396576f6d5bc8dc8226913/ptuning/main.py#L328

Adjusting max_length = 1025 fixes the problem: https://github.com/THUDM/ChatGLM-6B/blob/aeced3619b804d20d2396576f6d5bc8dc8226913/ptuning/main.py#L397

What is the cause of this?
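A minimal arithmetic sketch of what the warning implies (variable names are illustrative, not taken from main.py): `max_length` is a budget covering the prompt plus the generated tokens, so a 512-token input under `max_length=512` leaves zero tokens for generation, and the decoded hypothesis comes out empty.

```python
# Illustrative only: max_length counts the prompt *and* the generated tokens.
max_length = 512        # generation cap passed to the model
input_len = 512         # the prompt already consumes the whole budget
room_for_new_tokens = max(0, max_length - input_len)
print(room_for_new_tokens)  # 0 -> nothing is generated, hypothesis is empty
```

This is also why raising `max_length` above `max_source_length` (as in the workaround above) restores non-empty predictions.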

Expected Behavior

No response

Steps To Reproduce

Passing --max_source_length 512 --max_target_length 512 to evaluate.sh triggers the issue.

Environment

- OS: CentOS 8
- Python: 3.9
- Transformers: 4.26.1
- PyTorch: 1.12
- CUDA Support: True

Anything else?

No response

micrazy avatar Apr 24 '23 08:04 micrazy

+1

michelleqyhqyh avatar Apr 25 '23 06:04 michelleqyhqyh

I also found that heavy padding can produce empty output.

luolanfeixue avatar Apr 26 '23 07:04 luolanfeixue

I ran into the same issue!

cowarder avatar May 05 '23 07:05 cowarder

Same question!! I'd like to know: what is the relationship between PRE_SEQ_LEN, max_source_length, and max_target_length?

LOGIC-10 avatar May 05 '23 09:05 LOGIC-10

There are null values in the data, just clean up the data.

Chiang97912 avatar May 22 '23 06:05 Chiang97912

> There are null values in the data, just clean up the data.

I checked; there are no null values in my data.

micrazy avatar May 22 '23 06:05 micrazy

In my case, the model's predicted hypothesis contains only a newline character, which breaks the ROUGE calculation, so we need to check the model output and skip empty outputs. To fix this, change ptuning/main.py#L327 to the following code:

            hypothesis = ' '.join(hypothesis)
            reference = ' '.join(reference)
            if not hypothesis.strip() or not reference.strip():
                continue
            scores = rouge.get_scores(hypothesis, reference)
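The guard can be exercised without the real `rouge` package; here is a self-contained sketch with `get_scores` stubbed out (the helper name `score_pairs` and the stub are assumptions for illustration, not part of main.py or the rouge library):

```python
def score_pairs(pairs, get_scores):
    """Score (hypothesis_tokens, reference_tokens) pairs, skipping empties."""
    results = []
    for hyp_tokens, ref_tokens in pairs:
        hypothesis = ' '.join(hyp_tokens)
        reference = ' '.join(ref_tokens)
        # Skip pairs that would make rouge raise "Hypothesis is empty."
        if not hypothesis.strip() or not reference.strip():
            continue
        results.append(get_scores(hypothesis, reference))
    return results

# A hypothesis consisting of a lone newline is skipped instead of crashing.
stub = lambda h, r: {'hyp': h, 'ref': r}
out = score_pairs([(['\n'], ['ref']), (['a', 'b'], ['a', 'c'])], stub)
print(len(out))  # 1
```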

Chiang97912 avatar May 22 '23 08:05 Chiang97912

Hit the same thing. This is clearly a bug: the eval/predict lengths here should be kept consistent with the parameters in train.sh, otherwise the tokenizer misbehaves and everything decoded after inference is garbage.
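One way to enforce the consistency suggested above is to define the length parameters once and source them from both train.sh and evaluate.sh; a sketch (the file name `lengths.env` and the values are assumptions, not from the repo):

```shell
# lengths.env -- shared by train.sh and evaluate.sh (values illustrative)
MAX_SOURCE_LENGTH=64
MAX_TARGET_LENGTH=128

# Both scripts would then pass identical flags:
#   --max_source_length $MAX_SOURCE_LENGTH --max_target_length $MAX_TARGET_LENGTH
# Total token budget the generation loop sees:
echo $((MAX_SOURCE_LENGTH + MAX_TARGET_LENGTH))
```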

insist93 avatar May 25 '23 10:05 insist93