Running exo with deepseek-r1-distill-qwen-32b fails with a 'Currently only supports float32' error
My system is:
- OS: Arch Linux x86_64, Linux 6.12.9-arch1-1
- CPU: Intel Core i7-9700 @ 8x 4.7GHz
- GPU: NVIDIA GeForce GT 730 (GK208B)
- RAM: 31909MiB
I ran `exo --inference-engine mlx run deepseek-r1-distill-qwen-32b`, and it automatically downloaded `mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit`, but then I got the error below:
How can I fix this issue?
By the way, I only see Llama models in the list, no other models. Is there a config option to add others?
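Not a maintainer, but if you're willing to edit the source: in the exo checkouts I've seen, the model list lives in a Python registry (a `model_cards` dict in `exo/models.py`); treat the exact file and schema here as an assumption and check your own checkout. A minimal sketch of adding an entry, where each short name maps to a layer count plus one HuggingFace repo per inference engine:

```python
# Hypothetical addition to exo's model registry (schema assumed; verify
# against your checkout of exo/models.py before copying).
model_cards["my-qwen-32b"] = {
    "layers": 64,  # transformer layer count; must match the checkpoint
    "repo": {
        # MLX engine: Apple Silicon Macs only, MLX-quantized weights are fine
        "MLXDynamicShardInferenceEngine": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit",
        # tinygrad engine (non-Mac nodes): wants an unquantized checkpoint,
        # which is likely where the 'Currently only supports float32' error
        # comes from when it's handed MLX 4-bit weights
        "TinygradDynamicShardInferenceEngine": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    },
}
```

If that schema is right, a model only appears on a node whose engine has a repo entry, which would also explain why non-Mac machines see a shorter list.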
I have the same problem. How can I resolve it?
same here
I thought MLX inference was Apple Silicon only? I don't think this is a Mac. Also, the only models available to people without Mac setups are the Llama models.
If you have a Mac it shows a lot of models; non-Mac machines only show the Llama ones. I'm not sure it's possible to run other models on a non-Mac setup.
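Right: MLX itself only runs on macOS with Apple Silicon, so `--inference-engine mlx` can never work on an Intel/Linux box like the one in the report. A minimal sketch of the platform gate involved (the helper name is mine, not exo's):

```python
import platform

def check_mlx_available() -> None:
    """Hypothetical helper: MLX requires macOS (Darwin) on arm64.

    exo's real engine selection differs; this just shows why the
    Arch Linux / i7-9700 machine above can't satisfy an MLX request.
    """
    if platform.system() != "Darwin" or platform.machine() != "arm64":
        raise RuntimeError(
            "MLX inference needs an Apple Silicon Mac; "
            "non-Mac nodes have to use the tinygrad engine instead"
        )

check_mlx_available()  # raises on any non-Mac node
```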
There's also a catch: when I run a Mac as my host, my Linux machines neither download nor use the models I pull on the Mac, so running the good stuff requires an all-Mac setup. I can run multiple machines to load a model, but a mixed cluster only loads Llama models; an all-Mac cluster can load many more. I thought Mac + Linux would work, but it only works for the Llama models, and in my experience those are better run on a single computer anyway.
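My guess at why only the Llama models survive a mixed cluster (an assumption about exo's behavior, not something I've confirmed in its source): a cluster can only offer models that have a checkpoint for every node's engine, i.e. the intersection of the per-engine lists, and the Llama models are the ones with both MLX and tinygrad repos. A toy illustration with made-up data:

```python
# Toy example: a model is usable cluster-wide only if every node's
# inference engine has a repo for it (engine names follow exo's style;
# the model availability below is invented for illustration).
node_engines = [
    "MLXDynamicShardInferenceEngine",       # the Mac host
    "TinygradDynamicShardInferenceEngine",  # the Linux machines
]

supported = {
    "MLXDynamicShardInferenceEngine": {
        "llama-3.1-8b", "deepseek-r1-distill-qwen-32b", "qwen-2.5-7b",
    },
    "TinygradDynamicShardInferenceEngine": {"llama-3.1-8b"},
}

usable = set.intersection(*(supported[e] for e in node_engines))
print(usable)  # {'llama-3.1-8b'} -- only the Llama model is offered
```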