
Support LM Studio

Open vrealzhou opened this issue 7 months ago • 4 comments

Please explain the motivation behind the feature request.
Is it possible to support LM Studio natively? LM Studio supports MLX-format models, which work better on Macs. It also supports switching models without restarting the service.

Describe the solution you'd like
LM Studio has a server mode that listens on a local port (1234 by default). It should be easy to send prompts to that port, but the integration should also handle the model list under the same provider, so users can switch models on the fly.

  • [x] I have verified this does not duplicate an existing feature request
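
For context, LM Studio's server speaks the OpenAI wire format, so a quick smoke test against a local instance looks roughly like the following (the port assumes LM Studio's default, and the model name is just an example; use one reported by the first command):

# list the models LM Studio is serving
curl http://127.0.0.1:1234/v1/models

# send a chat completion (example model name)
curl http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3-30b-a3b-mlx@8bit", "messages": [{"role": "user", "content": "hello"}]}'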

vrealzhou avatar May 19 '25 03:05 vrealzhou

It works with LM Studio. Configure goose to set OPENAI_HOST to http://127.0.0.1:1234

goose configure

This will update your existing config file
  if you prefer, you can edit it directly at /Users/rozgo/.config/goose/config.yaml

┌   goose-configure 
│
◇  What would you like to configure?
│  Configure Providers 
│
◇  Which model provider should we use?
│  OpenAI 
│
●  OPENAI_API_KEY is set via environment variable
│  
◇  Would you like to save this value to your keyring?
│  Yes 
│
●  Saved OPENAI_API_KEY to config file
│  
●  OPENAI_HOST is already configured
│  
◇  Would you like to update this value?
│  Yes 
│
◇  Enter new value for OPENAI_HOST
│  http://127.0.0.1:1234
│
●  OPENAI_BASE_PATH is already configured
│  
◇  Would you like to update this value?
│  No 
│
◇  Model fetch complete
│
◆  Select a model:
│  ○ qwen3-30b-a3b 
│  ○ qwen3-30b-a3b-mlx 
│  ● qwen3-30b-a3b-mlx@8bit 
│  ○ text-embedding-nomic-embed-text-v1.5 
└  
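
For reference, after this run the relevant entries in config.yaml should look roughly like the following sketch (key names taken from the prompts above; the OPENAI_BASE_PATH value is assumed here since it was left unchanged, and LM Studio accepts any non-empty API key, which goose stores separately):

GOOSE_PROVIDER: openai
GOOSE_MODEL: qwen3-30b-a3b-mlx@8bit
OPENAI_HOST: http://127.0.0.1:1234
OPENAI_BASE_PATH: v1/chat/completions  # assumed default, left as-is above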

rozgo avatar May 21 '25 02:05 rozgo

It's awesome that it works, but it would be great if the docs mentioned it among the supported providers. I found this issue after seeing someone on Reddit say it was unsupported, and I suspected that wasn't true.

PaluMacil avatar May 22 '25 04:05 PaluMacil

It happens to work because LM Studio is OpenAI-compatible.

There could be a lot of other OpenAI-compatible providers; I'm not sure it's feasible to list them all.

brianhuster avatar May 30 '25 06:05 brianhuster

There could be a lot of other OpenAI-compatible providers; I'm not sure it's feasible to list them all.

That's fair, but OTOH the docs specifically mention Ollama as a provider, and if you're looking to run models locally on a Mac, the two obvious options are Ollama and LM Studio, of which the latter is (for most people, probably) easier to manage. 🤷‍♂️

(Perhaps the Goose<->Ollama interface isn't using the OpenAI-compatible API, though? I haven't checked, and I guess that could account for Ollama's special-casing. Anyhow...)

gimbo avatar Jun 07 '25 07:06 gimbo

It doesn't work for me with the OpenAI provider...

2025-08-06 11:07:51 [INFO] [JIT] Requested model (qwen3-32b-mlx) is not loaded. Loading "lmstudio-community/Qwen3-32B-MLX-8bit" now...
2025-08-06 11:07:59 [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2025-08-06 11:07:59 [INFO] [LM STUDIO SERVER] Streaming response...
2025-08-06 11:07:59 [ERROR] The number of tokens to keep from the initial prompt is greater than the context length. Try to load the model with a larger context length, or provide a shorter input. Error Data: n/a, Additional Data: n/a

Goose installed on macOS via Homebrew.

taoeffect avatar Aug 06 '25 18:08 taoeffect

The number of tokens to keep from the initial prompt is greater than the context length. Try to load the model with a larger context length, or provide a shorter input. Error Data: n/a, Additional Data: n/a

In the left UI menu, open "My Models". For the model in question, on the right side of the page choose "Edit Model Default Configs". On the "Load" tab, "Context Length" needs to be larger than what goose is sending. Around 10,000 should get you past this error, though ideally you'd pick that number by something other than witchcraft.
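
If you'd rather set this without the UI, LM Studio's lms CLI can load a model with an explicit context length. A sketch, assuming a recent lms (0.3.x) where the --context-length flag is available, and an example model key:

# list downloaded models and their keys
lms ls

# load with a larger context window; --context-length assumed
# available in recent lms versions, and the model key is an example
lms load lmstudio-community/Qwen3-32B-MLX-8bit --context-length 10000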

FWIW, I experienced great responsiveness in the LM Studio chat itself, but poor delivery speed of responses back into goose. I stopped using it and went back to ollama, where I seem to have more success, albeit with different issues.

LM Studio version 0.3.23; goose-cli version 1.4.0.

HIH.

ewann avatar Aug 26 '25 06:08 ewann

closing in favor of #4197

DOsinga avatar Oct 11 '25 19:10 DOsinga