Add support for LM Studio server
LM Studio has recently become more popular, and it's probably one of the best GUI options for running local models on a MacBook. Another neat feature is its built-in server, which makes it a nice all-in-one gateway for local LLMs. In particular, it also supports MLX as a runtime for inference.
I have been using Tabby with vim through the built-in inference, but I would like to switch this to go through LM Studio. Tabby already supports Ollama, so if it's not a huge ask, could you add similar support for LM Studio?
The LM Studio APIs are OpenAI-compatible, so hopefully this ends up being easy to set up.
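For reference, a quick way to sanity-check that the local server really speaks the OpenAI protocol (assuming LM Studio's default port 1234 and at least one model loaded in the server tab) is to list the available models via /v1/models - a minimal sketch:

# Sanity check: list the models exposed by LM Studio's OpenAI-compatible server.
# Assumes the default port 1234 and at least one model loaded.
import requests

resp = requests.get("http://localhost:1234/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])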
Hello @radoslav11, since LM Studio exposes OpenAI-compatible APIs, I think you can point Tabby at it directly as an endpoint by configuring your local config.toml; Tabby shouldn't need any extra work. I tried LM Studio with deepseek-r1-distill-qwen-7b using the following config.toml:
[model.chat.http]
kind = "openai/chat"
model_name = "deepseek-r1-distill-qwen-7b"
api_endpoint = "http://localhost:1234/v1" #default port from LM studio
api_key = ""
This works.
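If you want to verify the endpoint outside of Tabby first, a minimal chat request (same assumptions: default port 1234, deepseek-r1-distill-qwen-7b loaded) should return a reply:

# Minimal chat request against LM Studio's OpenAI-compatible endpoint.
# Assumes the default port 1234 and deepseek-r1-distill-qwen-7b loaded.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "deepseek-r1-distill-qwen-7b",
        "messages": [{"role": "user", "content": "Say hello in one word."}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])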
By the way, we are still estimating how long it will take to add support for the think tag, and we will also add documentation for LM Studio later. Thanks for sharing!
Thanks, that's neat - I didn't realize passing the key as "" would work. Ideally I would like to use completion rather than chat mode, though. In particular, I tried the following:
[model.completion.http]
# actually it's mlx
kind = "opanai/completion"
model_name = "deepseek-coder-6.7b-base-mlx"
api_endpoint = "http://localhost:1234/v1"
api_key = ""
But, presumably because OpenAI doesn't really have a completions API by default, tabby serve doesn't allow "opanai/completion" in the spec:
rado:~/ $ tabby serve --port 12345
⠹ 1.010 s Starting...The application panicked (crashed).
Message: Unsupported model kind for http completion: opanai/completion
...
Hi @radoslav11, I think this is a typo: Tabby supports openai/completion, and you seem to have misspelled it as opanai instead of openai.
The same config works once the kind is spelled correctly:
[model.completion.http]
kind = "openai/completion"
model_name = "deepseek-coder-6.7b-base-mlx"
api_endpoint = "http://localhost:1234/v1"
api_key = ""
Wow 🤦♂️ - indeed works, thanks a lot!
https://tabby.tabbyml.com/docs/references/models-http-api/lm-studio/