Ollama + Qwen3-coder Error: does not support thinking
I use Ollama on a MacBook Pro and have tried qwen2.5-coder:1.5b as well as modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest.
The content of `~/.claude-code-router/config.json`:
```json
{
  "PORT": 3456,
  "Providers": [
    {
      "name": "ollama",
      "api_base_url": "http://localhost:11434/v1/chat/completions",
      "api_key": "ollama",
      "models": ["qwen2.5-coder:1.5b", "modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest"]
    }
  ],
  "Router": {
    "default": "ollama,modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest",
    "background": "ollama,modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest",
    "think": "ollama,modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest",
    "longContext": "ollama,modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest",
    "longContextThreshold": 60000,
    "webSearch": "ollama,modelscope.cn/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:latest"
  }
}
```
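As a sanity check independent of claude-code-router, you can send a plain (non-thinking) chat request to the same Ollama OpenAI-compatible endpoint the config points at, and see whether the model answers at all. This is only a sketch; the model name is taken from the config above, and validating the payload before sending is just a convenience step:

```shell
# Build a minimal OpenAI-style chat request for Ollama's /v1/chat/completions
# endpoint (same URL as "api_base_url" in the router config above).
payload='{"model":"qwen2.5-coder:1.5b","messages":[{"role":"user","content":"Say hi"}]}'

# Validate the payload is well-formed JSON before sending it.
echo "$payload" | python3 -c 'import json,sys; json.load(sys.stdin); print("payload ok")'

# With the Ollama server running locally, send the request directly:
#   curl -s http://localhost:11434/v1/chat/completions \
#     -H "Content-Type: application/json" -d "$payload"
```

If the direct request succeeds but the same model fails through `ccr code`, the problem is likely in what the router adds to the request (such as thinking or tool fields) rather than in Ollama itself.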
And when I start with `ccr code` and type my instructions, I get:
```
API Error: 400 {"error":{"message":"Error from provider(ollama,qwen2.5-coder:1.5b: 400):
{\"error\":{\"message\":\"\\\"qwen2.5-coder:1.5b\\\" does not support
thinking\",\"type\":\"api_error\",\"param\":null,\"code\":null}}\nError: Error from provider(ollama,qwen2.5-coder:1.5b:
400): {\"error\":{\"message\":\"\\\"qwen2.5-coder:1.5b\\\" does not support
thinking\",\"type\":\"api_error\",\"param\":null,\"code\":null}}\n\n at nt
(/opt/homebrew/lib/node_modules/@musistudio/claude-code-router/dist/cli.js:79940:11)\n at h0
(/opt/homebrew/lib/node_modules/@musistudio/claude-code-router/dist/cli.js:79998:11)\n at
process.processTicksAndRejections (node:internal/process/task_queues:105:5)\n at async l0 (/opt/homebrew/lib/node_modul
es/@musistudio/claude-code-router/dist/cli.js:79965:96)","type":"api_error","code":"provider_response_error"}}
```
My questions are:
- I already know that I can press Tab to disable thinking mode, and that works.
- It is very slow on my M3 MacBook Pro, even for qwen2.5-coder:1.5b. It seems Ollama reloads the model for each Claude Code instruction?
- What is the best practice? Which model is best suited for an M3 MacBook Pro + Claude Code?
Thanks very much.
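On the slowness: a minimal sketch, assuming the slowdown comes from Ollama evicting the model between requests. By default Ollama unloads a model after about five minutes of inactivity, so each new Claude Code turn after a pause can pay the full model-load cost again. `OLLAMA_KEEP_ALIVE` is a real Ollama environment variable; the `1h` value here is just an example:

```shell
# Keep model weights resident in memory for an hour of inactivity
# instead of the ~5 minute default, avoiding repeated reloads.
export OLLAMA_KEEP_ALIVE=1h
echo "keep-alive set to: $OLLAMA_KEEP_ALIVE"

# The variable takes effect for a server started in this environment, e.g.:
#   ollama serve
```

If you run Ollama as the macOS menu-bar app rather than from a terminal, the variable has to be set where the app can see it (for example via `launchctl setenv OLLAMA_KEEP_ALIVE 1h` before restarting the app).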
Same issue here. It worked for me for a couple of prompts, but then it got stuck midway and started failing with the same error. I tried other models too (such as GLM 4.5 Air and Devstral), with the same result. When I disabled thinking, I got the same kind of error but for tools ("does not support tools").
Hmm, I got it working with a non-HF model, like qwen3 from the Ollama library (non-GGUF).
Okay, it seems to be more complicated. I have tried different models: qwen-coder works, GLM 4.5 Air works. None of the REAP versions worked, and many finetunes did not work. Soon I shall test UD quants and the qwen3 x deepseek distill.
NB! I started using Ollama library models (via search you can also find more custom ones; I didn't know that before). I have had more luck with those.
It seems to be quite hard to find working models.
It is almost working :D I wish I had more resources so I could test more :/ GLM 4.5 Air at Q2 and Q3 was too slow for me, and REAP did not work, so I cannot use that, nor shall I test any higher-parameter models.
I switched to LocalAI for now but am still having issues (different issues, though). I shall probably try out LM Studio as well (even though I don't like that it is closed source).