Inference with internlm-xcomposer2 leads to `ValueError: Input length of input_ids is 0, but max_length is set to -1066.`
Describe the bug
Running the finetuned internlm-xcomposer2-7b-chat leads to an error.
token len:history:113, query:1706
Traceback (most recent call last):
File "/home/ldl/pi_code/swift/pi_code/infer_internlm-xcomposer2.py", line 83, in <module>
response, _ = inference(model, template, value, history)
File "/home/ldl/pi_code/swift/swift/llm/utils/utils.py", line 692, in inference
generate_ids = model.generate(
File "/home/ldl/miniconda3/envs/swift/lib/python3.10/site-packages/peft/peft_model.py", line 1190, in generate
outputs = self.base_model.generate(*args, **kwargs)
File "/home/ldl/miniconda3/envs/swift/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ldl/miniconda3/envs/swift/lib/python3.10/site-packages/transformers/generation/utils.py", line 1449, in generate
self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
File "/home/ldl/miniconda3/envs/swift/lib/python3.10/site-packages/transformers/generation/utils.py", line 1140, in _validate_generated_length
raise ValueError(
ValueError: Input length of input_ids is 0, but `max_length` is set to -1066. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
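For context, a hedged reading of the numbers in this traceback (an assumption on my part, not confirmed from swift's source): in transformers' `generate()`, when `max_new_tokens` is set, the effective `max_length` is derived as `max_new_tokens + input_ids_length`. internlm-xcomposer2 feeds the model `inputs_embeds` rather than `input_ids` (hence the reported input length of 0), so a negative `max_new_tokens`, e.g. one computed as some fixed budget minus the 1706-token query, would surface directly as the `-1066`:

```python
# Hedged reconstruction of the error, not confirmed against swift's source.
input_ids_length = 0    # image prompts go in as inputs_embeds, not input_ids
max_new_tokens = -1066  # plausibly a fixed budget minus the 1706-token query

# How transformers' generate() derives the effective max_length when
# max_new_tokens is set; this reproduces the value in the ValueError above.
max_length = max_new_tokens + input_ids_length
assert max_length == -1066
```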
Also, running the web UI on the training data produces a wrong reply:
CUDA_VISIBLE_DEVICES=0 swift app-ui --share True --ckpt_dir ckp_output/internlm-xcomposer2-7b-chat/v10-20240502-202001/checkpoint-60/ --load_dataset_config true
I'm sorry, but I am unable to analyze the images as I am a text-based AI assistant and do not have the ability to view or interpret images.
Your hardware and system info
Additional context
With CUDA_VISIBLE_DEVICES=0,1 swift infer --ckpt_dir ckp_output/internlm-xcomposer2-7b-chat/v10-20240502-202001/checkpoint-60/
it works, but VRAM usage grows from 18 GB to 48+ GB after only 4 questions (6 pics + text) and leads to OOM. Is this a bug?
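I have not confirmed whether this growth is expected. As a hypothetical diagnostic (not from the original run), logging CUDA memory between questions would show whether usage scales with the accumulated chat history / KV cache:

```python
import torch

# Hypothetical instrumentation: log CUDA memory after each question to see
# whether usage grows with the accumulated history.
def log_vram(tag: str) -> None:
    alloc = torch.cuda.memory_allocated() / 2**30
    reserved = torch.cuda.memory_reserved() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB")
```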
Changing model.generation_config.max_new_tokens solved the problem.
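For reference, a minimal sketch of that workaround; 512 is an arbitrary example value, and `model`, `template`, `value`, and `history` are the variables from the script in the traceback above:

```python
# Cap the number of newly generated tokens directly, so nothing is derived
# from (max_length - prompt_length). 512 is an arbitrary example value.
model.generation_config.max_new_tokens = 512
response, _ = inference(model, template, value, history)
```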
But I'm still confused about the difference between these parameters; could you help explain them?
From https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-xcomposer2-7b/files:
model.config.max_length
model.generation_config.max_length
model.generation_config.max_new_tokens
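My understanding (please correct me if wrong): `max_length` bounds the total sequence (prompt plus generated tokens), while `max_new_tokens` bounds only the newly generated tokens and takes precedence when both are set; `model.config.max_length` is the legacy location that `generate()` falls back to when `model.generation_config` does not override it. Illustrated with this issue's 1706-token query:

```python
from transformers import GenerationConfig

# max_length caps prompt + generated tokens: with a 1706-token query,
# max_length=2048 leaves at most 2048 - 1706 = 342 new tokens.
total_budget = GenerationConfig(max_length=2048)

# max_new_tokens caps only the generated tokens, regardless of prompt
# length, and takes precedence over max_length when both are set.
new_only = GenerationConfig(max_new_tokens=512)
```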