llava-v1.6: unsupported operand type(s) for //: 'int' and 'NoneType'
These models used to work OK, but we now get:
mlx version: 0.22.0.dev20250110+1ce0c0fcb
mlx-vlm version: 0.1.10
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/llava-v1.6-34b-8bit
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 10828.12it/s]
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 12417.83it/s]
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: <|im_start|>user
<image>
Provide a factual caption, description and comma-separated keywords or tags for this image so that it can be searched for easily.<|im_end|>
<|im_start|>assistant
Failed to generate output for model at mlx-community/llava-v1.6-34b-8bit: unsupported operand type(s) for //: 'int' and 'NoneType'
********************************************************************************
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Running mlx-community/llava-v1.6-mistral-7b-8bit
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 11227.22it/s]
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 26024.64it/s]
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: [INST] <image>
Provide a factual caption, description and comma-separated keywords or tags for this image so that it can be searched for easily. [/INST]
Failed to generate output for model at mlx-community/llava-v1.6-mistral-7b-8bit: unsupported operand type(s) for //: 'int' and 'NoneType'
********************************************************************************
I can't replicate this.
Could you provide the full traceback?
Not sure what further traceback I can offer. In my case, the smoke test produces the same thing.
There is a full traceback for that error that your tests are not printing. I need it to understand where the error is raised.
I ran the smoke test with that model and it passed on my side:
mlx version: 0.22.0.dev20250110+1ce0c0fcb
Please note that you are still using an unofficial release of mlx.
I would recommend that you uninstall it, install the official release, and try again.
pip uninstall mlx
pip install -U mlx
Outside of that we are running the same version of mlx-vlm so it should work.
Let me know if the error persists after you install the official release.
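To compare environments quickly, here is a minimal sketch that lists the installed versions of the three packages mentioned in this thread and flags development builds. The helper names `installed_versions` and `is_dev_build` are made up for illustration, and the dev-build check is only a heuristic based on the version string shown above (`0.22.0.dev20250110+1ce0c0fcb`).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("mlx", "mlx-vlm", "transformers")):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def is_dev_build(ver):
    """Heuristic: a '.dev' segment or a '+local' suffix suggests a non-release build."""
    return ".dev" in ver or "+" in ver


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        if ver is None:
            print(f"{name}: not installed")
        else:
            note = " (development build)" if is_dev_build(ver) else ""
            print(f"{name}: {ver}{note}")
```

Running this before and after reinstalling mlx should show whether the official release actually replaced the dev build.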
It may well be an mlx issue. I expect that there is some way to print the full stack trace to identify the culprit.
python smoke_test.py --models-file llava.txt --image /Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg
0%| | 0/2 [00:00<?, ?it/s]╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Testing mlx-community/llava-v1.6-34b-8bit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Loading model...
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 10761.12it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 14312.16it/s]
✓ Model loaded successfully | 0/17 [00:00<?, ?it/s]
Testing vision-language generation...
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: <|im_start|>user
<image>
Describe this image.<|im_end|>
<|im_start|>assistant
✗ vision-language generation failed: unsupported operand type(s) for //: 'int' and 'NoneType'
Testing language-only generation...
==========
Image: None
Prompt: <|im_start|>user
Hi, how are you?<|im_end|>
<|im_start|>assistant
Hello! I'm just a computer program, I don't have feelings or physical sensations, so I don't have a "how I am." But I'm here to help you with any questions or tasks you may have. How can I assist you today?
==========
Prompt: 15 tokens, 4.075 tokens-per-sec
Generation: 59 tokens, 11.798 tokens-per-sec
Peak memory: 36.997 GB
✓ language-only generation successful
Cleaning up...
✓ Cleanup complete
50%|██████████████████████████████████████████▌ | 1/2 [00:20<00:20, 20.20s/it]╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Testing mlx-community/llava-v1.6-mistral-7b-8bit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Loading model...
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 19448.09it/s]
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 32896.50it/s]
✓ Model loaded successfully | 0/12 [00:00<?, ?it/s]
Testing vision-language generation...
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: [INST] <image>
Describe this image. [/INST]
✗ vision-language generation failed: unsupported operand type(s) for //: 'int' and 'NoneType'
Testing language-only generation...
==========
Image: None
Prompt: [INST] Hi, how are you? [/INST]
Hello! I'm just a computer program, so I don't have feelings or emotions. Is there something I can help you with?
==========
Prompt: 14 tokens, 75.599 tokens-per-sec
Generation: 31 tokens, 60.094 tokens-per-sec
Peak memory: 8.080 GB
✓ language-only generation successful
Cleaning up...
✓ Cleanup complete
100%|█████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:23<00:00, 11.78s/it]
╭────────────────────────────────────────────────────── Results ───────────────────────────────────────────────────────╮
│ ✗ mlx-community/llava-v1.6-34b-8bit │
│ ✗ mlx-community/llava-v1.6-mistral-7b-8bit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Some models tested failed to test
╭───────────────────────────────────────────────── System Information ─────────────────────────────────────────────────╮
│ │
│ MAC OS: v15.2 │
│ Python: v3.12.7 │
│ MLX: v0.22.0 │
│ MLX-VLM: v0.1.10 │
│ Transformers: v4.48.0 │
│ │
│ Hardware: │
│ • Chip: Apple M4 Max │
│ • RAM: 128.0 GB │
│ • CPU Cores: 16 │
│ • GPU Cores: 40 │
│ │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
You will need to install mlx-vlm from source and change the test_smoke.py file. (I will make the necessary change to display the full traceback in the next release.)
Instead, you could:
- run using mlx_vlm.generate in your terminal and get the full traceback.
- or install mlx as I suggested earlier and see if it solves your issue.
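For anyone patching their own test script in the meantime, the following is a rough sketch of how a test step could print the full traceback rather than only the exception message. `run_step` is a hypothetical helper, not part of mlx-vlm or its smoke test; it relies only on the standard-library `traceback` module.

```python
import traceback


def run_step(label, func, *args, **kwargs):
    """Run one test step; on failure, print the full traceback, not just the message.

    Returns (True, result) on success and (False, traceback_text) on failure.
    """
    try:
        result = func(*args, **kwargs)
    except Exception as exc:
        print(f"[FAIL] {label} failed: {exc}")
        details = traceback.format_exc()  # full stack, including library frames
        print(details)
        return False, details
    print(f"[OK] {label} successful")
    return True, result
```

Wrapping the generation call in something like `run_step("vision-language generation", generate, **generate_args)` would then show which library frame raised the error.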
I have produced a fuller traceback, and it seems that the issue is in transformers. I do seem to recall a warning when running earlier versions.
python smoke_test.py --models-file llava.txt --image /Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg
0%| | 0/2 [00:00<?, ?it/s]╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Testing mlx-community/llava-v1.6-34b-8bit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Loading model...
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 11987.76it/s]
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Fetching 17 files: 100%|█████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 10993.40it/s]
✓ Model loaded successfully | 0/17 [00:00<?, ?it/s]
Testing vision-language generation...
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: <|im_start|>user
<image>
Describe this image.<|im_end|>
<|im_start|>assistant
✗ vision-language generation failed: unsupported operand type(s) for //: 'int' and 'NoneType'
Traceback (most recent call last):
File "/Users/jrp/Documents/AI/mlx/scripts/vlm/smoke_test.py", line 110, in test_generation
output = generate(**generate_args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1101, in generate
for response in stream_generate(model, processor, prompt, image, **kwargs):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1015, in stream_generate
inputs = prepare_inputs(
^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 812, in prepare_inputs
inputs = processor(
^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/transformers/models/llava_next/processing_llava_next.py", line 162, in __call__
num_image_tokens = self._get_number_of_features(orig_height, orig_width, height, width)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/transformers/models/llava_next/processing_llava_next.py", line 181, in _get_number_of_features
patches_height = height // self.patch_size
~~~~~~~^^~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for //: 'int' and 'NoneType'
Testing language-only generation...
==========
Image: None
Prompt: <|im_start|>user
Hi, how are you?<|im_end|>
<|im_start|>assistant
Hello! I'm an AI language model, so I don't have feelings or personal experience, but I'm always ready to assist you with any questions you may have. How can I help?
==========
Prompt: 15 tokens, 5.109 tokens-per-sec
Generation: 43 tokens, 11.431 tokens-per-sec
Peak memory: 36.997 GB
✓ language-only generation successful
Cleaning up...
✓ Cleanup complete
50%|██████████████████████████████████████████▌ | 1/2 [00:15<00:15, 15.92s/it]╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Testing mlx-community/llava-v1.6-mistral-7b-8bit │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Loading model...
Fetching 12 files: 100%|██████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 9843.86it/s]
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 27654.75it/s]
✓ Model loaded successfully | 0/12 [00:00<?, ?it/s]
Testing vision-language generation...
==========
Image: ['/Users/jrp/Pictures/Processed/20250104-211707_DSC01899.jpg']
Prompt: [INST] <image>
Describe this image. [/INST]
✗ vision-language generation failed: unsupported operand type(s) for //: 'int' and 'NoneType'
Traceback (most recent call last):
File "/Users/jrp/Documents/AI/mlx/scripts/vlm/smoke_test.py", line 110, in test_generation
output = generate(**generate_args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1101, in generate
for response in stream_generate(model, processor, prompt, image, **kwargs):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1015, in stream_generate
inputs = prepare_inputs(
^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 812, in prepare_inputs
inputs = processor(
^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/transformers/models/llava_next/processing_llava_next.py", line 162, in __call__
num_image_tokens = self._get_number_of_features(orig_height, orig_width, height, width)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/transformers/models/llava_next/processing_llava_next.py", line 181, in _get_number_of_features
patches_height = height // self.patch_size
~~~~~~~^^~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for //: 'int' and 'NoneType'
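The last frame shows the floor division failing because `self.patch_size` is `None`. As an illustration only, the sketch below reproduces the same `TypeError` with plain functions and shows how an explicit check would give a clearer message. The function names are hypothetical and this is not the transformers implementation; the values 336 and 14 are example numbers.

```python
def patches_unguarded(height, patch_size):
    """Mirror the failing expression from the traceback: height // patch_size."""
    return height // patch_size


def patches_guarded(height, patch_size):
    """Same computation, but fail early with a message that names the missing setting."""
    if patch_size is None:
        raise ValueError(
            "patch_size is None; it must be set in the processing config "
            "or assigned on the processor before images are processed"
        )
    return height // patch_size


if __name__ == "__main__":
    print(patches_guarded(336, 14))
    try:
        patches_unguarded(336, None)
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

The unguarded call raises the same error text as in the logs above, which is consistent with the processor having been loaded without a `patch_size` value.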
I see, you are probably running the latest transformers.
Let me try something to see if it fixes it.
Try again now :)
No change, I'm afraid. Looking back at earlier runs, I see that there was a deprecation warning:
Expanding inputs for image tokens in LLaVa-NeXT should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.50.
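The warning suggests setting the two attributes directly on the processor. A minimal sketch of that workaround follows, assuming a processor object that exposes them as plain attributes. The helper `fill_missing_processor_attrs` is hypothetical, and the defaults `14` and `"default"` are assumptions for illustration; the correct values should come from the model's own vision configuration.

```python
def fill_missing_processor_attrs(
    processor,
    patch_size=14,  # assumed value; check the model's vision config
    vision_feature_select_strategy="default",  # assumed value as well
):
    """Set the two attributes named in the warning if they are missing or None.

    Returns the names of the attributes that were filled in.
    """
    changed = []
    if getattr(processor, "patch_size", None) is None:
        processor.patch_size = patch_size
        changed.append("patch_size")
    if getattr(processor, "vision_feature_select_strategy", None) is None:
        processor.vision_feature_select_strategy = vision_feature_select_strategy
        changed.append("vision_feature_select_strategy")
    return changed
```

This only patches the processor in memory for the current run; the lasting fix is to have the values in the model's processing config.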
I just fixed that
Download a fresh copy of the model weights :)
https://huggingface.co/mlx-community/llava-v1.6-mistral-7b-8bit/commit/b8df5f329d95a7abe6429ed46093f9b84e8e6396
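To confirm that a local copy of a model has picked up the change, one option is to inspect its processing config for the two keys named in the deprecation warning. The sketch below assumes the config is a JSON file; which file holds these keys (and where it sits in the local cache) is not stated in this thread, so the path is left as a parameter and the function name is hypothetical.

```python
import json
from pathlib import Path

# Keys named in the deprecation warning quoted above.
REQUIRED_KEYS = ("patch_size", "vision_feature_select_strategy")


def missing_processor_keys(config_path):
    """Return the required keys that are absent or null in a JSON config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if config.get(key) is None]
```

An empty list means both keys are present; any names returned point to a copy that still lacks them, whether because it is stale or because that model has not been updated yet.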
Thanks. That seems to fix mlx-community/llava-v1.6-mistral-7b-8bit. Is the problem the same for mlx-community/llava-v1.6-34b-8bit?
My pleasure!
I think so
I will update all models.