
Implement MLX support in Instructor to interact with LLMs locally on Apple platforms (M1/M2/M3)

Open matiasdev30 opened this issue 6 months ago • 4 comments

Is your feature request related to a problem? Please describe.

I'm interested in running LLMs locally on Apple Silicon (M1/M2/M3) using Instructor, but currently the library only supports OpenAI and compatible APIs. There is no native support for Apple's MLX framework, which is optimized for these devices. As a result, it's not possible to fully leverage the privacy, speed, and cost benefits of running LLMs directly on Mac hardware using Instructor.

Describe the solution you'd like

I'd like to see Instructor support MLX as a backend for model inference. This could be implemented as a new client or adapter, allowing users to pass prompts and receive structured outputs from locally hosted LLMs (such as Llama, Mistral, or Phi models running via MLX) in the same way they would with OpenAI. Ideally, the API would remain consistent, just swapping the backend.
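To make the idea concrete, here is a rough sketch of what the call site could look like. Note that `instructor.from_mlx` does not exist today and is purely hypothetical, and the model name is only an example; the `response_model` usage mirrors Instructor's existing API.

```python
from pydantic import BaseModel
import instructor  # real package; from_mlx below is hypothetical

class UserInfo(BaseModel):
    name: str
    age: int

# Hypothetical adapter: wrap a locally loaded MLX model the same way
# instructor.from_openai wraps an OpenAI client today.
client = instructor.from_mlx("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

user = client.chat.completions.create(
    response_model=UserInfo,  # structured output validated by Pydantic
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
print(user)  # UserInfo(name='John Doe', age=30)
```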

Describe alternatives you've considered

I've considered using other frameworks or creating custom wrappers for MLX, but none offer the seamless, schema-driven and robust structured output experience Instructor provides. Other projects like Toolio are exploring MLX agents, but they don't have the same Pythonic interface or validation features.

Additional context

  • Apple MLX repo: https://github.com/apple/mlx
  • Example of LLM inference with MLX: https://github.com/ml-explore/mlx-examples
  • This would make Instructor even more useful for privacy-conscious and offline-first applications, especially for Mac users.
  • If needed, I'm happy to help test or provide feedback on this feature!

matiasdev30 · Jun 20 '25 12:06

Is it possible to run something like this via ollama or Llama Cpp?
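For context, structured output already works against Ollama's OpenAI-compatible endpoint. A minimal sketch, assuming Ollama is running locally with a model such as llama3 pulled:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# Ollama serves an OpenAI-compatible API on localhost:11434 by default;
# the api_key value is ignored by Ollama but required by the OpenAI client.
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)

user = client.chat.completions.create(
    model="llama3",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
print(user)
```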

jxnl · Jun 20 '25 17:06

Is it possible to run something like this via ollama or Llama Cpp?

It's possible, but the problem is the architecture: the Apple platform is a different architecture, and the machine's CPU and memory get overloaded because those runtimes don't interact well with the LLM.

matiasdev30 · Jun 22 '25 13:06

@jxnl I can work on this feature to implement MLX support so Apple devices can make better use of Instructor.

matiasdev30 · Jun 23 '25 09:06

I'm also interested in this:

  • Ollama uses llama.cpp - which uses Apple Metal
  • MLX from Apple is its own array framework for Apple Silicon - it runs on the CPU and on the GPU via Metal, taking advantage of unified memory

For some models, MLX can be faster - and it would be great to use a single tool like instructor to run apples-to-apples (pun intended 🍎 ) comparisons.

The integration could be done with https://github.com/ml-explore/mlx-lm, which may be easier to use.
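For a sense of what the backend side looks like, here is a minimal, untested sketch that generates text with mlx-lm and validates the result with Pydantic by hand. The model name is just an example, and a real Instructor adapter would handle the JSON prompting, parsing, and retries automatically:

```python
import json
from mlx_lm import load, generate
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

# Example 4-bit model from the mlx-community hub; any mlx-lm compatible model works.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

prompt = (
    "Extract the user as JSON with keys 'name' and 'age'. "
    "Reply with JSON only.\n\nJohn Doe is 30 years old."
)
text = generate(model, tokenizer, prompt=prompt, max_tokens=100)

# Manual validation step that an Instructor adapter would automate (with retries)
user = UserInfo.model_validate(json.loads(text))
print(user)
```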

phlogisticfugu · Oct 07 '25 21:10