Support LM Studio
Please explain the motivation behind the feature request. Is it possible to support LM Studio natively? LM Studio supports MLX formats, which work better on Macs. It also supports switching models without restarting the service.
Describe the solution you'd like LM Studio has a service mode that listens on port 12345. It should be easy to send prompts to that port, but the support should also handle the model list under the same provider so users can switch models on the fly.
- [x] I have verified this does not duplicate an existing feature request
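For the "handle the model list" part of the request: LM Studio's local server already exposes the available models over its OpenAI-compatible REST API, so a provider could populate its model picker from that. A minimal sketch, assuming the server is running on its default port 1234:

```sh
# List the models LM Studio's local server exposes via its OpenAI-compatible
# /v1/models route (default port assumed; adjust if you changed it).
curl -s http://127.0.0.1:1234/v1/models
```

The `id` fields in the response's `data` array are the model names a provider could offer for on-the-fly switching.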
It works with LM Studio. Configure goose to set OPENAI_HOST to http://127.0.0.1:1234:
goose configure
This will update your existing config file
if you prefer, you can edit it directly at /Users/rozgo/.config/goose/config.yaml
┌ goose-configure
│
◇ What would you like to configure?
│ Configure Providers
│
◇ Which model provider should we use?
│ OpenAI
│
● OPENAI_API_KEY is set via environment variable
│
◇ Would you like to save this value to your keyring?
│ Yes
│
● Saved OPENAI_API_KEY to config file
│
● OPENAI_HOST is already configured
│
◇ Would you like to update this value?
│ Yes
│
◇ Enter new value for OPENAI_HOST
│ http://127.0.0.1:1234
│
● OPENAI_BASE_PATH is already configured
│
◇ Would you like to update this value?
│ No
│
◇ Model fetch complete
│
◆ Select a model:
│ ○ qwen3-30b-a3b
│ ○ qwen3-30b-a3b-mlx
│ ● qwen3-30b-a3b-mlx@8bit
│ ○ text-embedding-nomic-embed-text-v1.5
└
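If you'd rather check or edit the file by hand (the path comes from the `goose configure` output above), something along these lines should show the result. The key names are inferred from the prompts in the transcript, so verify them against your own file:

```sh
# Inspect what `goose configure` wrote (path taken from the output above).
cat /Users/rozgo/.config/goose/config.yaml
# Expect, among other entries, values roughly like these (key names inferred
# from the prompts above -- verify against your own goose version):
#   GOOSE_PROVIDER: openai
#   GOOSE_MODEL: qwen3-30b-a3b-mlx@8bit
#   OPENAI_HOST: http://127.0.0.1:1234
```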
It's awesome that it works, but it would be great if the docs mentioned it among the supported providers. I found this issue after seeing someone on Reddit say it was unsupported, and I suspected that wasn't true.
It happens to work because LM Studio is OpenAI-compatible.
There could be a lot of other OpenAI-compatible providers; I'm not sure it's feasible to list them all.
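To make the "OpenAI-compatible" point concrete, this is roughly the request shape goose sends, just pointed at LM Studio instead of api.openai.com. A sketch using the default port and the model name from the transcript above:

```sh
# Send an OpenAI-style chat completion to LM Studio's local server.
# Any client that lets you override the OpenAI base URL can do the same.
curl -s http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen3-30b-a3b-mlx@8bit",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}]
      }'
```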
That's fair, but OTOH the docs specifically mention Ollama as a provider, and if you're looking to run models locally on a Mac, the two obvious options are Ollama and LM Studio, of which the latter is (for most, probably) easier to manage. 🤷♂️
(Perhaps the Goose<->Ollama interface is not using the OpenAI-compatible API though? I haven't checked and I guess that could account for Ollama's special-casing. Anyhow...)
It doesn't work for me with the OpenAI provider...
2025-08-06 11:07:51 [INFO]
[JIT] Requested model (qwen3-32b-mlx) is not loaded. Loading "lmstudio-community/Qwen3-32B-MLX-8bit" now...
2025-08-06 11:07:59 [INFO]
[LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2025-08-06 11:07:59 [INFO]
[LM STUDIO SERVER] Streaming response...
2025-08-06 11:07:59 [ERROR]
The number of tokens to keep from the initial prompt is greater than the context length. Try to load the model with a larger context length, or provide a shorter input. Error Data: n/a, Additional Data: n/a
Goose installed on macOS via Homebrew.
The number of tokens to keep from the initial prompt is greater than the context length. Try to load the model with a larger context length, or provide a shorter input. Error Data: n/a, Additional Data: n/a
In the left UI menu, open "My Models". For the model in question, on the right side of the page, open "Edit Model Default Configs". On the "Load" tab, "Context Length" needs to be larger than what goose is sending. Probably 10,000 gets you past this error, and equally likely, you should configure that using something other than witchcraft.
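If you'd rather avoid the GUI, LM Studio also ships an `lms` command-line tool that can load a model with a larger context window. Treat the flag below as an assumption rather than a verified recipe (check `lms load --help` on your version):

```sh
# Hypothetical sketch: reload the model with a bigger context window via the
# `lms` CLI. The --context-length flag name is assumed -- confirm it with
# `lms load --help` for your LM Studio version.
lms load lmstudio-community/Qwen3-32B-MLX-8bit --context-length 16384
```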
FWIW I experienced great responsiveness via the LM Studio chat, but poor delivery speed of responses back into goose. I stopped using it and went back to ollama, as I seem to have more success there, albeit with different issues.
My LM Studio version: 0.3.23 (0.3.23); my goose-cli version: 1.4.0.
HIH.
Closing in favor of #4197.