Feature Request
Support for fully local AI with Ollama
Let me know if you want to collab.
You can do this!!
Create a ~/.code_puppy/extra_models.json and put this in:
{
  "qwen3-coder-30b": {
    "type": "custom_openai",
    "name": "Qwen3-Coder-30B-A3B-Instruct",
    "custom_endpoint": {
      "url": "http://localhost:11434"
    },
    "context_length": 256000
  }
}
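One caveat: the name has to correspond to a model that Ollama actually has pulled locally. Assuming the upstream tag for this model is qwen3-coder:30b, that would be something like:

ollama pull qwen3-coder:30b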
I have tested this, and the Ollama API contract requires /v1. Try this:
{ "qwen3-coder-30b": { "type": "custom_openai", "name": "qwen3-coder:30b", "custom_endpoint": { "url": "http://localhost:11434/v1/" }, "context_length": 256000 } }
Also note that Ollama has a default context length of 4,096 tokens, from what I can find. You can increase it by setting the OLLAMA_CONTEXT_LENGTH environment variable, but the higher you set it, the more VRAM you need.
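For example, to start the server with a larger window (the 32768 below is just a sketch; size it to your VRAM):

# Set the context length before starting the server
OLLAMA_CONTEXT_LENGTH=32768 ollama serve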
Interesting, thank you! -Trevor
@tjdodson - I would highly recommend using LM Studio instead of Ollama. You'll have a much better experience and get faster, better inference.
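If you go that route, the same extra_models.json shape should work pointed at LM Studio's OpenAI-compatible server, which listens on port 1234 by default. The name below is whatever identifier LM Studio shows for your loaded model, so treat it as a placeholder:

{
  "qwen3-coder-30b": {
    "type": "custom_openai",
    "name": "qwen3-coder-30b-a3b-instruct",
    "custom_endpoint": {
      "url": "http://localhost:1234/v1/"
    },
    "context_length": 256000
  }
}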
Closing, as this is already supported.