idea: MLX support
Problem Statement
Jan does not currently support MLX as an inference engine. This limits users on Apple Silicon who want to leverage MLX's optimized performance for running local models.
User comments:
https://www.reddit.com/r/LocalLLaMA/comments/1lf5yog/comment/myq2e89/
https://www.reddit.com/r/LocalLLaMA/comments/1lf5yog/comment/mypm0yl/
https://www.reddit.com/r/LocalLLaMA/comments/1lf5yog/comment/mym0yax/
Feature Idea
Integrate MLX as a selectable inference backend, allowing users to run models directly on Apple's MLX stack. This would broaden Jan's utility on macOS and align it with Apple's growing on-device ML ecosystem.
Would like insight from both of you here as well, @qnixsynapse @gau-nernst
Once we finish with the llama.cpp extension, I think supporting MLX won't be too difficult. We already bundle uvx with Jan.
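As a rough sketch of what that could look like: since Jan already bundles uvx, it could spawn mlx-lm's `mlx_lm.server` (which serves an OpenAI-compatible API) as a child process. The model id and port below are placeholders, not anything Jan defines today.

```typescript
import { spawn } from "node:child_process";

// Sketch only: launch mlx-lm's OpenAI-compatible server through the bundled uvx.
// Assumes the mlx-lm package and its mlx_lm.server entry point; model id is a placeholder.
function startMlxServer(modelId: string, port = 8080) {
  const proc = spawn(
    "uvx",
    ["--from", "mlx-lm", "mlx_lm.server", "--model", modelId, "--port", String(port)],
    { stdio: "inherit" }
  );

  proc.on("exit", (code) => {
    console.log(`mlx_lm.server exited with code ${code}`);
  });

  return proc;
}

startMlxServer("mlx-community/Llama-3.2-3B-Instruct-4bit");
```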
For now I think we should focus local inference on the backends that ggml supports. An MLX server that exposes an OpenAI-compatible API can already be added to Jan as an external provider.
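For the external-provider path, this is a minimal sketch of talking to such a server through its OpenAI-compatible endpoint. The port and route follow mlx_lm.server's defaults; the model id is again a placeholder.

```typescript
// Sketch only: query a locally running MLX server the same way any
// OpenAI-compatible external provider would be queried.
async function chatWithMlx(prompt: string): Promise<string> {
  const response = await fetch("http://127.0.0.1:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "mlx-community/Llama-3.2-3B-Instruct-4bit", // placeholder model id
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}

chatWithMlx("Hello from Jan!").then(console.log);
```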