Which specific models work with this framework?
This is a nice framework to use for image analysis, captioning, etc.
Is there a doc somewhere that sets out which models, specifically, can be driven through this app/library? When you say "Pixtral", e.g., which of the versions should work (without further conversion, and on what size of machine)?
I know that you say that LLaVA is no longer state of the art, but what is better?
Thanks.
Otherwise I get errors like:
(mlx) ➜ mlx_vlm git:(main) ✗ python mytest.py
Fetching 3 files: 100%|█████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 6772.29it/s]
ERROR:root:Config file not found in /Users/jrp/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7
Traceback (most recent call last):
  File "/Users/zzz/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest.py", line 12, in <module>
    model, processor = load(model_path)
    ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 116, in load_model
    config = load_config(model_path)
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 268, in load_config
    with open(model_path / "config.json", "r") as f:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/zzz/.cache/huggingface/hub/models--mistralai--Pixtral-12B-2409/snapshots/df119bf36c0cedc6ffdc9ca6c58ebf51f9771ef7/config.json'
@jrp2014 good question!
In general you can find the correct models in the mlx-community repo. They are usually converted and uploaded there before the release.
We currently support the Pixtral version from the mistral-community. This version is formatted like llava.
https://huggingface.co/mistral-community/pixtral-12b
Thanks. I don't find the search function on Hugging Face particularly easy to use.
Not sure what's going wrong here:
import mlx.core as mx
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config
# Load the model
model_path = "mistral-community/pixtral-12b"
model, processor = load(model_path)
config = load_config(model_path)
# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."
# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)
# Generate output
output = generate(model, processor, image, formatted_prompt, verbose=False)
print(output)
results in
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 28688.81it/s]
Traceback (most recent call last):
  File "/Users/jrp/Documents/AI/mlx/mlx-vlm/mlx_vlm/mytest3.py", line 8, in <module>
    model, processor = load(model_path)
    ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral
This has been run from the latest mlx_vlm directory.
Install from source.
I recently merged a PR fixing all the bugs.
yes, that's what I am doing.
pip install git+https://github.com/Blaizzy/mlx-vlm.git
Uninstall and reinstall from source.
It seems you have an older version.
Check the version you have installed.
Let me know if the issue persists with version 0.1.0.
Is there a way of checking what version is being run from the python script?
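One way to do that from Python is via the standard library's importlib.metadata. A minimal sketch, assuming the distribution name is mlx-vlm (as pip list shows it):

```python
# Look up the installed version of a package from inside a script.
# Uses only the stdlib; returns None instead of raising if the
# package is not installed.
from importlib.metadata import version, PackageNotFoundError

def installed_version(package: str):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

print(installed_version("mlx-vlm"))  # e.g. "0.1.0", or None if missing
```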
Successfully built mlx-vlm
Installing collected packages: mlx-vlm
Successfully installed mlx-vlm-0.1.0
Fails as above.
Try
pip list | grep mlx
Can you try running this in your terminal?
python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt "What animal is this?"
Still no go, I'm afraid. No doubt it is something about my setup, but I can't see what it could be; it's built straight from a clone of your GitHub repository.
python -m mlx_vlm.generate --model mistral-community/pixtral-12b --max-tokens 100 --temp 0.0 --prompt 'What animal is this?'
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 35226.52it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 96, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 73, in main
    model, processor, image_processor, config = get_model_and_processors(
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/generate.py", line 61, in get_model_and_processors
    model, processor = load(
    ^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: pixtral
Please share the result of
pip list | grep mlx
lightning-whisper-mlx 0.0.10
mlx 0.18.1.dev20241011+c21331d4
mlx-data 0.0.2
mlx-lm 0.19.1
mlx-vlm 0.1.0
mlx-whisper 0.3.0
Try this model and let me know if the issue persists.
mlx-community/pixtral-12b-8bit
Something doesn't add up, because your logs say the model is being loaded with the llava arch instead of pixtral.
I will take a look.
Well, this one doesn't crash out, but it just spins without producing an answer, either from the command line or via the script above.
python -m mlx_vlm.generate --model mlx-community/pixtral-12b-8bit --max-tokens 100 --temp 0.0
Fetching 11 files: 100%|█████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 44706.73it/s]
==========
Image: ['http://images.cocodataset.org/val2017/000000039769.jpg']
Prompt: <s>[INST]What are these?[IMG][/INST]
Found the issue!
This version points to llava in the model config. I patched it locally.
Don't worry, I will add a condition to fix this at load time.
https://huggingface.co/mistral-community/pixtral-12b/blob/main/config.json
What are the specs of your machine?
Try to pass --resize-shape 128 128 or --resize-shape 224 224
Also try the 4bit version instead of the 8bit.
mlx-community/pixtral-12b-4bit
On second thought, I don't think it's a good idea to add a condition for one model.
You can use all the models already converted in the mlx-community repo (4bit, 8bit and bf16). Otherwise, to use the mistral-community model, you just have to change the config.json model_type from llava to pixtral.
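For reference, that config.json edit can be scripted. A minimal sketch (the sample config below is illustrative; the real file lives in the model's snapshot directory under ~/.cache/huggingface/hub):

```python
# Rewrite the model_type field of a config.json in place.
import json
import tempfile
from pathlib import Path

def set_model_type(config_path: Path, model_type: str) -> None:
    """Load config.json, replace model_type, and write it back."""
    config = json.loads(config_path.read_text())
    config["model_type"] = model_type
    config_path.write_text(json.dumps(config, indent=2))

# Demo against a throwaway copy rather than a real snapshot.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps({"model_type": "llava"}))
    set_model_type(path, "pixtral")
    print(json.loads(path.read_text())["model_type"])  # pixtral
```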
OK. Thanks. It'd be good to document some of these points up front, as the connection between the model names used here and the various Hugging Face repositories is a little tenuous for new users.
Could you help me with that?
Also, perhaps add a way to scan for models on mlx-community by name?
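In the meantime, something like this can be done with the huggingface_hub package. A rough sketch, assuming huggingface_hub is installed (list_models and its author/search parameters are its documented API; the helper name is made up):

```python
# Search the mlx-community organization on the Hugging Face Hub
# for model repos whose name matches a query string.
from huggingface_hub import HfApi

def find_mlx_models(query: str, limit: int = 10) -> list:
    """Return repo ids in mlx-community matching `query`."""
    api = HfApi()
    models = api.list_models(author="mlx-community", search=query, limit=limit)
    return [m.id for m in models]

if __name__ == "__main__":
    for repo_id in find_mlx_models("pixtral"):
        print(repo_id)
```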
Sorry, but the models are too big for me to download and test comprehensively. I suggest that when you add a new model type, you give an example of the model that you used to test the addition. Also, you could just point to the Hugging Face models that you have put up.
(My setup now seems to work again, starting from a fresh clone. Perhaps I shouldn't use iCloud to transfer my files between machines.)
But with the Mistral repo, which now has a config file, when I replace the model_type with llava, I still get:
> python mytest.py
Fetching 15 files: 100%|█████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 15352.50it/s]
Traceback (most recent call last):
  File "/Users/xxx/Documents/AI/mlx/scripts/vlm/mytest.py", line 19, in <module>
    model, processor = load(model_path)
    ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 251, in load
    model = load_model(model_path, lazy)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 189, in load_model
    model = model_class.Model(model_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/llava.py", line 61, in __init__
    self.vision_tower = VisionModel(config.vision_config)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/models/llava/vision.py", line 232, in __init__
    raise ValueError(f"Unsupported model type: {self.model_type}")
ValueError: Unsupported model type: llava
Closing as stale.