Does not work with LMstudio or Jan.ai locally
Trying to run locally with LM Studio or Jan.ai.

```
2025-09-15 14:19:13.126 [info] Loading View: modelPlayground
2025-09-15 14:19:18.807 [error] Failed to chatStream. model = "Custom/Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf", errorMessage = "Error: Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct.", errorType = "u", errorObject = {"innerError":{"status":404,"headers":{},"requestID":null}}
2025-09-15 14:19:18.807 [error] Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct. 404 Not Found
2025-09-15 14:20:03.404 [error] Failed to chatStream. model = "Custom/Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf", errorMessage = "Error: Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct.", errorType = "u", errorObject = {"innerError":{"status":404,"headers":{},"requestID":null}}
2025-09-15 14:20:03.405 [error] Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct. 404 Not Found
2025-09-15 14:20:26.870 [info] Information: Microsoft.Neutron.Telemetry.TelemetryLogger [0] 2025-09-15T14:20:26.8689352+03:00 UserAgent: Command:ListLoadedModels Status:Success Direct:True Time:0ms
```
Jan.ai config:
- Endpoint: http://localhost:1337/v1
- Model: Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf (entered in both name fields)
- API key: ***
LM Studio config:
- Endpoint: http://192.168.1.130:1234
- Model: Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0 (entered in both name fields)
- API key: not set, as LM Studio has no such key
Hi @akierum, could you double-check that you set the custom model's chat completions endpoint to the full URL, something like http://192.168.1.130:1234/v1/chat/completions?
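To illustrate the difference between a bare server address and the full endpoint URL, here is a small sketch (a hypothetical helper, not part of AI Toolkit) that expands a configured base address into the OpenAI-compatible chat completions path the client ultimately has to call:

```python
def chat_completions_url(base: str) -> str:
    """Hypothetical helper: expand a bare server address into the full
    OpenAI-compatible chat completions endpoint.

    Leaves the URL untouched if the full path is already present."""
    base = base.rstrip("/")
    if base.endswith("/v1/chat/completions"):
        return base
    if base.endswith("/v1"):
        return base + "/chat/completions"
    return base + "/v1/chat/completions"

# The configs above stop short of the full path:
print(chat_completions_url("http://192.168.1.130:1234"))
# → http://192.168.1.130:1234/v1/chat/completions
print(chat_completions_url("http://localhost:1337/v1"))
# → http://localhost:1337/v1/chat/completions
```

If the extension does not do this expansion itself, entering only the host and port would explain the 404s in the log above.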
Yes, I tried everything; it does not work. The Jan.ai and LM Studio configs above work fine with other clients on the same PC. The only difference in this case is the plugin: vscode-ai-toolkit does not work.
I tried http://192.168.1.130:1234/v1/chat/completions. LM Studio logs this error:

```
2025-09-16 22:23:43 [ERROR] Unexpected endpoint or method. (POST /). Returning 200 anyway
```
I can successfully run the AI Toolkit playground with LM Studio 0.3.13. My settings:
- Endpoint: http://localhost:1234/v1/chat/completions
- Model name: qwen3-0.6b
- Auth token: unused
Could you first try with curl to verify that LM Studio works correctly? Something like:
```shell
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-0.6b",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes. Today is Thursday" },
      { "role": "user", "content": "What day is it today?" }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
  }'
```
> I tried the http://192.168.1.130:1234/v1/chat/completions. LMstudio gives this error `2025-09-16 22:23:43 [ERROR] Unexpected endpoint or method. (POST /). Returning 200 anyway`
In the meantime, your error message suggests the request is actually being sent to the URL path / (because of `(POST /). Returning 200 anyway`). I also see the hostname in your URL is 192.168.1.130 instead of localhost, so I assume you are accessing LM Studio through a reverse proxy or firewall? Could you check whether your forwarding is set up correctly? It seems it does not forward the URL path correctly (it rewrites /v1/chat/completions to /).
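If a reverse proxy such as nginx sits in front of LM Studio (an assumption — the thread never says which proxy, if any, is in use), a minimal sketch of a config that forwards the path unchanged would look like this; the ports are hypothetical:

```nginx
server {
    listen 1234;  # the port the client connects to (hypothetical)

    location / {
        # No URI part after the upstream address, so the request path
        # (/v1/chat/completions) is passed through verbatim.
        # Adding a URI here (e.g. proxy_pass http://127.0.0.1:12345/;
        # under a longer location prefix) makes nginx replace the matched
        # prefix, which can rewrite the path down to "/".
        proxy_pass http://127.0.0.1:12345;  # LM Studio's internal port (hypothetical)
    }
}
```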
```
curl http://localhost:1234/v1/chat/completions

StatusCode        : 200
StatusDescription : OK
Content           : {"error":"Unexpected endpoint or method. (GET /v1/chat/completions)"}
RawContent        : HTTP/1.1 200 OK
                    Connection: keep-alive
                    Keep-Alive: timeout=5
                    Content-Length: 69
                    Content-Type: application/json; charset=utf-8
                    Date: Thu, 18 Sep 2025 12:46:41 GMT
                    ETag: W/"45-BMhsXJJPQVg8zzNDNDFE...
Forms             : {}
Headers           : {[Connection, keep-alive], [Keep-Alive, timeout=5], [Content-Length, 69],
                    [Content-Type, application/json; charset=utf-8]...}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 69
```
I got it to work:
- I had to load the model in LM Studio manually.
- I had to run `curl http://localhost:1234/api/v0/models`, which shows the name the model uses.
- I had to use http://localhost:1234/v1/chat/completions as the endpoint.
- Model: qwen3-coder-30b-a3b-instruct-480b-distill-v2 (entered in both name fields)
- API key: *** (anything works)
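The model identifier returned by /api/v0/models is what has to go into the model name field. As a sketch, pulling the ids out of that JSON looks like this (the sample payload is hypothetical and abridged — the real response carries more fields per model):

```python
import json

def model_ids(models_response: str) -> list[str]:
    """Extract usable model identifiers from an OpenAI-style model list
    response, i.e. {"data": [{"id": ...}, ...]}."""
    return [m["id"] for m in json.loads(models_response)["data"]]

# Hypothetical, abridged sample of what /api/v0/models might return.
sample = '{"data": [{"id": "qwen3-coder-30b-a3b-instruct-480b-distill-v2"}]}'
print(model_ids(sample))
# → ['qwen3-coder-30b-a3b-instruct-480b-distill-v2']
```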
But please automate this the way Cline and similar tools do. Having to discover and enter all of this manually is unreasonable.