IndexError: index 0 is out of bounds for dimension 0 with size 0, raised at the line "if cache_position is None or (cache_position is not None and cache_position[0] == 0):"
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 38/38 [00:33<00:00, 1.14it/s]
tensor([], device='cuda:0', dtype=torch.int64)
Traceback (most recent call last):
File "/home/xzy/xjy/qwen/test.py", line 55, in
I met the same problem as you, have you solved it?
Sorry, I have not solved it.
+1
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████| 38/38 [04:06<00:00, 6.48s/it]
Traceback (most recent call last):
File "/picassox/sfs-mtlab-train-base/segmentation/lzj7/qwen_caption.py", line 46, in
same issue
I also hit this error. When I ran SFT starting from qwen2-vl-instruct, inference with the resulting model was correct, but when I ran SFT starting from qwen2-vl-base, inference with the resulting model failed.
Did you solve this problem?
I found the problem comes from the inputs: when I printed the inputs, the text token tensor was empty.
inputs = self.processor(text=[text], images=image_inputs, videos=video_inputs, padding=True, return_tensors="pt")
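A quick guard for this failure mode (a sketch, not from the original script; check_inputs is a hypothetical helper): if the chat template renders no text, input_ids comes back empty, and the later cache_position[0] lookup fails with the opaque IndexError above. Plain lists stand in for tensors here so the sketch runs without torch:

```python
def check_inputs(input_ids):
    """Fail early with a clear message instead of the IndexError inside generate()."""
    if len(input_ids) == 0 or len(input_ids[0]) == 0:
        raise ValueError(
            "input_ids is empty - check that the chat template rendered the prompt"
        )
    return input_ids

# An empty batch (what the broken chat_template produces) is rejected:
try:
    check_inputs([[]])
except ValueError as e:
    print("caught:", e)

# A non-empty batch passes through unchanged:
print(check_inputs([[151644, 8948]]))
```

Calling this right after the processor makes the root cause visible at the input-preparation step rather than deep inside generation.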
Has anyone solved this problem?
Maybe you downloaded the wrong model; check whether you downloaded "Qwen2-VL-*B-Instruct" or "Qwen/Qwen2-VL-*B".
I found this problem occurs under FSDP training. After setting CUDA_LAUNCH_BLOCKING=1, the error actually turns out to be on the previous line:
input_ids = input_ids[:, cache_position]
Qwen2VLForConditionalGeneration inherits from GenerationMixin; comparing the two prepare_inputs_for_generation implementations:
- Qwen2VLForConditionalGeneration#prepare_inputs_for_generation
- GenerationMixin#prepare_inputs_for_generation
GenerationMixin has an extra description of Exception 3 and the corresponding condition:
# 2. Generic cache-dependent input preparation
# If we have cache: let's slice `input_ids` through `cache_position`, to keep only the unprocessed tokens
# Exception 1: when passing input_embeds, input_ids may be missing entries
# Exception 2: some generation methods do special slicing of input_ids, so we don't need to do it here
# Exception 3: with synced GPUs cache_position may go out of bounds, but we only want dummy token in that case.
# (we can't check exception 3 while compiling)
...
if (
    inputs_embeds is not None  # Exception 1
    or (is_torchdynamo_compiling() or cache_position[-1] >= input_ids.shape[1])  # Exception 3
):
    input_ids = input_ids[:, -cache_position.shape[0] :]
After adding this missing condition, inference works normally.
Dependency versions:
transformers 4.47.1
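The slicing logic described above can be sketched torch-free with plain lists (an illustration of the branch structure, not the actual transformers source; is_torchdynamo_compiling is assumed False here):

```python
def slice_input_ids(input_ids, cache_position, inputs_embeds=None):
    """Mimic the cache-dependent input slicing in prepare_inputs_for_generation."""
    seq_len = len(input_ids[0])
    if (
        inputs_embeds is not None  # Exception 1
        or (len(cache_position) > 0 and cache_position[-1] >= seq_len)  # Exception 3
    ):
        # Keep only the trailing len(cache_position) tokens.
        return [row[-len(cache_position):] for row in input_ids]
    # Default: gather by position - this is the line that crashed under FSDP
    # when cache_position pointed past the end of input_ids.
    return [[row[i] for i in cache_position] for row in input_ids]

# With synced GPUs, cache_position can exceed the sequence length; the
# Exception 3 branch avoids the out-of-bounds gather:
print(slice_input_ids([[1, 2, 3]], [3]))       # -> [[3]]
print(slice_input_ids([[1, 2, 3]], [0, 1, 2])) # -> [[1, 2, 3]]
```

Without the Exception 3 branch, the second return line would index position 3 of a length-3 row and raise, which matches the reported traceback.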
@steermomo Can you be more specific? I'm a rookie.
I encountered the same problem. The root cause is that the chat_template of Qwen2-VL and Qwen2-VL-Instruct is different, which causes the input_ids to be empty before inference. So the solution is to replace the chat_template.json of Qwen2-VL with the chat_template.json of Instruct. It solved my problem.
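The workaround described above amounts to copying the Instruct model's chat_template.json over the base model's. A small sketch (paths and the helper name are placeholders; the demo uses throwaway directories, while real usage would point at your two local model snapshot directories):

```python
import json
import os
import shutil
import tempfile

def fix_chat_template(base_dir, instruct_dir):
    """Overwrite base model's chat_template.json with the Instruct model's copy."""
    src = os.path.join(instruct_dir, "chat_template.json")
    dst = os.path.join(base_dir, "chat_template.json")
    shutil.copyfile(src, dst)
    return dst

# Demo with temporary directories standing in for the two model folders:
base = tempfile.mkdtemp()
inst = tempfile.mkdtemp()
with open(os.path.join(inst, "chat_template.json"), "w") as f:
    json.dump({"chat_template": "{% for message in messages %}..."}, f)

dst = fix_chat_template(base, inst)
print(os.path.isfile(dst))  # -> True
```

For models downloaded via huggingface_hub, the file normally sits at the top level of the downloaded model directory alongside config.json.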
Thanks! It works for me.
What task are you working on? Would you like to exchange ideas?
Thank you, can someone please fix this? I almost ditched this model.
where is this file stored? can you share steps to fix the problem please?
Thanks! It was indeed caused by chat_template.json!