mlx-vlm
InternVL3 Multi-Image Support Broken
When I pass more than one image to either InternVL3-1B-4bit or InternVL3-2B-4bit, I get the error below. The same image array works with SmolVLM2-500M-Video-Instruct and llava-interleave-qwen-0.5b-4bit:
Files: [<PIL.Image.Image image mode=RGB size=1280x720 at 0x10F0C6690>, <PIL.Image.Image image mode=RGB size=1280x720 at 0x10F0B1820>, <PIL.Image.Image image mode=RGB size=1280x720 at 0x151178770>, <PIL.Image.Image image mode=RGB size=1280x720 at 0x1511787D0>]
Prompt: User: <image>
<image>
<image>
<image>
Describe this video.
Assistant:
Warning: Failed to process inputs with error: list index out of range Trying to process inputs with return_tensors='pt'
Traceback (most recent call last):
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 821, in process_inputs_with_fallback
    inputs = process_inputs(
             ^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 813, in process_inputs
    inputs = processor(
             ^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/models/internvl_chat/processor.py", line 318, in __call__
    question = text[idx]
               ~~~~^^^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 830, in process_inputs_with_fallback
    inputs = process_inputs(processor, images, prompts, return_tensors="pt")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 813, in process_inputs
    inputs = processor(
             ^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/models/internvl_chat/processor.py", line 318, in __call__
    question = text[idx]
               ~~~~^^^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/moroneyt/Documents/Vision-Testing/vlm-test.py", line 65, in <module>
    output = generate(model, processor, formatted_prompt, frames, verbose=False, max_tokens=100)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1208, in generate
    for response in stream_generate(model, processor, prompt, image, **kwargs):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1096, in stream_generate
    inputs = prepare_inputs(
             ^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 886, in prepare_inputs
    inputs = process_inputs_with_fallback(processor, images, prompts)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/moroneyt/Documents/Vision-Testing/venv/lib/python3.12/site-packages/mlx_vlm/utils.py", line 832, in process_inputs_with_fallback
    raise ValueError(
ValueError: Failed to process inputs with error: list index out of range. Please install PyTorch and try again.
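
A guess at the cause, based only on the traceback and not verified against the mlx-vlm source: the failing line `question = text[idx]` suggests the InternVL processor indexes the prompt list once per image. With one prompt string and four frames, that lookup would run past the end of the list on the second image. The sketch below is a hypothetical stand-in for that loop, not the actual mlx-vlm code; `pair_prompts_with_images` and `reproduce` are names I made up, and the frames are plain strings rather than PIL images.

```python
# Hypothetical stand-in for the failing loop; NOT the actual mlx-vlm code.
def pair_prompts_with_images(text, images):
    """Index the prompt list once per image, as the traceback suggests."""
    pairs = []
    for idx, image in enumerate(images):
        question = text[idx]  # raises IndexError once idx >= len(text)
        pairs.append((question, image))
    return pairs


def reproduce(num_images):
    """Return the error message for one prompt and num_images images, or None."""
    # One prompt string containing one <image> placeholder per frame.
    prompts = ["User: " + "<image>\n" * num_images + "Describe this video.\nAssistant:"]
    images = [f"frame-{i}" for i in range(num_images)]  # placeholders for PIL images
    try:
        pair_prompts_with_images(prompts, images)
    except IndexError as exc:
        return str(exc)
    return None


print(reproduce(1))  # single image, single prompt
print(reproduce(4))  # four images, single prompt
```

If this is what is happening, it would also explain why a single image works with the same models. The retry with `return_tensors='pt'` fails at the same line, so the closing "Please install PyTorch" message looks unrelated to the actual problem.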