mlx-llm-server

Benchmarks?

Open • krzysiekpodk opened this issue 1 year ago • 1 comment

Do you have some benchmarks against llama.cpp?

krzysiekpodk • Feb 20 '24 11:02

It will be slower than llama.cpp, since MLX is a general-purpose machine learning framework rather than one specialized for LLM inference. However, the MLX team is actively working on performance, and I believe it will improve significantly in the future.

mzbac • Feb 20 '24 14:02