MLX backend
Can ollama be converted to use Apple's MLX as the backend for the models?
This, please!
What do you hope to gain from this? I don't think MLX is faster for inference, at least not yet.
Found these benchmarks: https://medium.com/@andreask_75652/benchmarking-apples-mlx-vs-llama-cpp-bbbebdc18416
Seems like MLX is indeed slower than the llama.cpp masterpiece, at least for now. I didn't verify it myself, though.
This would be very nice! And not only for text generation: image/multimodal models would get a boost too.
Someone already made something along these lines: https://github.com/kspviswa/PyOMlx
Ollama is awesome and does so many things, and some of us want to play with MLX models.
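For anyone who wants to experiment while waiting on official support, here is a minimal sketch of running a model through MLX directly, assuming the `mlx-lm` Python package (`pip install mlx-lm`) on an Apple Silicon Mac; the model name below is just an example of a quantized repo from the mlx-community Hugging Face organization:

```python
# Minimal sketch: text generation with MLX via the mlx-lm package.
# Assumes Apple Silicon and `pip install mlx-lm`; the model repo is an example.
from mlx_lm import load, generate

# Downloads (if needed) and loads the MLX-converted model and its tokenizer.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# Generate a response; verbose=True also prints tokens-per-second stats,
# which is handy for comparing against llama.cpp-based ollama.
response = generate(
    model,
    tokenizer,
    prompt="Why is the sky blue?",
    max_tokens=256,
    verbose=True,
)
print(response)
```

This is roughly what a wrapper like PyOMlx builds on; it is not how ollama itself would integrate MLX.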
bump
Commenting here to say we're aware of MLX. I've been working on a prototype, but I can't give an ETA for MLX support at this time.
Related to this: Apple CoreML support to utilize the Apple Neural Engine (ANE) alongside the GPU & CPU: https://github.com/ollama/ollama/issues/3898