Running exo with deepseek-r1-distill-qwen-32b fails with a 'Currently only supports float32' error
My system is:
- OS: Arch Linux x86_64, Linux 6.12.9-arch1-1
- CPU: Intel Core i7-9700 @ 8x 4.7GHz
- GPU: NVIDIA GeForce GT 730 (GK208B)
- RAM: 31909MiB
I ran `exo --inference-engine mlx run deepseek-r1-distill-qwen-32b`, and it automatically downloaded `mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit`, but then I got the error below:
How can I fix this issue?
By the way, I only see Llama models in the list, no other models. Is there a config option to add others?
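Not a maintainer, but if you're willing to edit the source: in the exo checkouts I've seen, the model list lives in a Python registry (a `model_cards` dict in `exo/models.py`); treat the exact file and schema here as an assumption and check your own checkout. A minimal sketch of adding an entry, where each short name maps to a layer count plus one HuggingFace repo per inference engine:

```python
# Hypothetical addition to exo's model registry (schema assumed; verify
# against your checkout of exo/models.py before copying).
model_cards["my-qwen-32b"] = {
    "layers": 64,  # transformer layer count; must match the checkpoint
    "repo": {
        # MLX engine: Apple Silicon Macs only, MLX-quantized weights are fine
        "MLXDynamicShardInferenceEngine": "mlx-community/DeepSeek-R1-Distill-Qwen-32B-4bit",
        # tinygrad engine (non-Mac nodes): wants an unquantized checkpoint,
        # which is likely where the 'Currently only supports float32' error
        # comes from when it's handed MLX 4-bit weights
        "TinygradDynamicShardInferenceEngine": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    },
}
```

If that schema is right, a model only appears on a node whose engine has a repo entry, which would also explain why non-Mac machines see a shorter list.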
I have the same problem. How can I resolve it?
same here
I thought MLX inference was Apple Silicon only? I don't think this is a Mac. Also, the only models available to people without Mac setups are the Llama models.
If you have a Mac it shows a lot of models; non-Mac machines only show the Llama ones. I'm not sure it's possible to run other models on a non-Mac setup.
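Right: MLX itself only runs on macOS with Apple Silicon, so `--inference-engine mlx` can never work on an Intel/Linux box like the one in the report. A minimal sketch of the platform gate involved (the helper name is mine, not exo's):

```python
import platform

def check_mlx_available() -> None:
    """Hypothetical helper: MLX requires macOS (Darwin) on arm64.

    exo's real engine selection differs; this just shows why the
    Arch Linux / i7-9700 machine above can't satisfy an MLX request.
    """
    if platform.system() != "Darwin" or platform.machine() != "arm64":
        raise RuntimeError(
            "MLX inference needs an Apple Silicon Mac; "
            "non-Mac nodes have to use the tinygrad engine instead"
        )

check_mlx_available()  # raises on any non-Mac node
```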
There's also a catch: when I run a Mac as my host, my Linux machines neither download nor use the models I pull on the Mac, so running the good stuff requires an all-Mac setup. I can run multiple machines to load a model, but a mixed cluster only loads Llama models; an all-Mac cluster can load many more. I thought Mac + Linux would work, but it only works for the Llama models, and in my experience those are better run on a single computer anyway.
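My guess at why only the Llama models survive a mixed cluster (an assumption about exo's behavior, not something I've confirmed in its source): a cluster can only offer models that have a checkpoint for every node's engine, i.e. the intersection of the per-engine lists, and the Llama models are the ones with both MLX and tinygrad repos. A toy illustration with made-up data:

```python
# Toy example: a model is usable cluster-wide only if every node's
# inference engine has a repo for it (engine names follow exo's style;
# the model availability below is invented for illustration).
node_engines = [
    "MLXDynamicShardInferenceEngine",       # the Mac host
    "TinygradDynamicShardInferenceEngine",  # the Linux machines
]

supported = {
    "MLXDynamicShardInferenceEngine": {
        "llama-3.1-8b", "deepseek-r1-distill-qwen-32b", "qwen-2.5-7b",
    },
    "TinygradDynamicShardInferenceEngine": {"llama-3.1-8b"},
}

usable = set.intersection(*(supported[e] for e in node_engines))
print(usable)  # {'llama-3.1-8b'} -- only the Llama model is offered
```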