Does not work with LMstudio or Jan.ai locally
Trying to run locally with LM Studio or Jan.ai.

```
2025-09-15 14:19:13.126 [info] Loading View: modelPlayground
2025-09-15 14:19:18.807 [error] Failed to chatStream. model = "Custom/Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf", errorMessage = "Error: Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct.", errorType = "u", errorObject = {"innerError":{"status":404,"headers":{},"requestID":null}}
2025-09-15 14:19:18.807 [error] Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct. 404 Not Found
2025-09-15 14:20:03.404 [error] Failed to chatStream. model = "Custom/Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf", errorMessage = "Error: Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct.", errorType = "u", errorObject = {"innerError":{"status":404,"headers":{},"requestID":null}}
2025-09-15 14:20:03.405 [error] Unable to call the Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf inference endpoint due to 404. Please check if the input or configuration is correct. 404 Not Found
2025-09-15 14:20:26.870 [info] Information: Microsoft.Neutron.Telemetry.TelemetryLogger [0] 2025-09-15T14:20:26.8689352+03:00 UserAgent: Command:ListLoadedModels Status:Success Direct:True Time:0ms
```
Jan.ai config:
- Endpoint: http://localhost:1337/v1
- Model: Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0.gguf (entered in both name fields)
- API key: ***
LM Studio config:
- Endpoint: http://192.168.1.130:1234
- Model: Qwen3-30B-A3B-Instruct-Coder-480B-Distill-v2-Q8_0 (entered in both name fields)
- API key: not set, as LM Studio has no such key
Hi @akierum, could you double-check that you set the custom model's chat completions endpoint to the full URL, something like http://192.168.1.130:1234/v1/chat/completions?
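To illustrate the difference between a bare server address and the full endpoint URL, here is a small sketch (a hypothetical helper, not part of AI Toolkit) that expands a configured base address into the OpenAI-compatible chat completions path the client ultimately has to call:

```python
def chat_completions_url(base: str) -> str:
    """Hypothetical helper: expand a bare server address into the full
    OpenAI-compatible chat completions endpoint.

    Leaves the URL untouched if the full path is already present."""
    base = base.rstrip("/")
    if base.endswith("/v1/chat/completions"):
        return base
    if base.endswith("/v1"):
        return base + "/chat/completions"
    return base + "/v1/chat/completions"

# The configs above stop short of the full path:
print(chat_completions_url("http://192.168.1.130:1234"))
# → http://192.168.1.130:1234/v1/chat/completions
print(chat_completions_url("http://localhost:1337/v1"))
# → http://localhost:1337/v1/chat/completions
```

If the extension does not do this expansion itself, entering only the host and port would explain the 404s in the log above.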
Yes, I tried everything; it does not work. The Jan.ai and LM Studio configs above work fine with other clients on the same PC. The only difference in this case is the plugin: vscode-ai-toolkit does not work.
I tried http://192.168.1.130:1234/v1/chat/completions. LM Studio logs this error:

```
2025-09-16 22:23:43 [ERROR] Unexpected endpoint or method. (POST /). Returning 200 anyway
```
I can successfully run the AI Toolkit playground with LM Studio 0.3.13. My settings:
- Endpoint: http://localhost:1234/v1/chat/completions
- Model name: qwen3-0.6b
- Auth token: unused
Could you first try with curl to verify that LM Studio works correctly? Something like:
```shell
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-0.6b",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes. Today is Thursday" },
      { "role": "user", "content": "What day is it today?" }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
  }'
```
> I tried the http://192.168.1.130:1234/v1/chat/completions. LMstudio gives this error `2025-09-16 22:23:43 [ERROR] Unexpected endpoint or method. (POST /). Returning 200 anyway`
In the meantime, your error message suggests the request is actually being sent to the URL path / (because of `(POST /). Returning 200 anyway`). I also see the hostname in your URL is 192.168.1.130 instead of localhost, so I assume you are accessing LM Studio through a reverse proxy or firewall? Could you check whether your forwarding is set up correctly? It seems it does not forward the URL path correctly (it rewrites /v1/chat/completions to /).
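If a reverse proxy such as nginx sits in front of LM Studio (an assumption — the thread never says which proxy, if any, is in use), a minimal sketch of a config that forwards the path unchanged would look like this; the ports are hypothetical:

```nginx
server {
    listen 1234;  # the port the client connects to (hypothetical)

    location / {
        # No URI part after the upstream address, so the request path
        # (/v1/chat/completions) is passed through verbatim.
        # Adding a URI here (e.g. proxy_pass http://127.0.0.1:12345/;
        # under a longer location prefix) makes nginx replace the matched
        # prefix, which can rewrite the path down to "/".
        proxy_pass http://127.0.0.1:12345;  # LM Studio's internal port (hypothetical)
    }
}
```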
```
curl http://localhost:1234/v1/chat/completions

StatusCode        : 200
StatusDescription : OK
Content           : {"error":"Unexpected endpoint or method. (GET /v1/chat/completions)"}
RawContent        : HTTP/1.1 200 OK
                    Connection: keep-alive
                    Keep-Alive: timeout=5
                    Content-Length: 69
                    Content-Type: application/json; charset=utf-8
                    Date: Thu, 18 Sep 2025 12:46:41 GMT
                    ETag: W/"45-BMhsXJJPQVg8zzNDNDFE...
Forms             : {}
Headers           : {[Connection, keep-alive], [Keep-Alive, timeout=5], [Content-Length, 69],
                    [Content-Type, application/json; charset=utf-8]...}
Images            : {}
InputFields       : {}
Links             : {}
ParsedHtml        : mshtml.HTMLDocumentClass
RawContentLength  : 69
```
I got it to work:
- I had to load the model in LM Studio manually.
- I had to run `curl http://localhost:1234/api/v0/models`, which shows the name the model uses.
- I had to use http://localhost:1234/v1/chat/completions as the endpoint.
- Model: qwen3-coder-30b-a3b-instruct-480b-distill-v2 (entered in both name fields)
- API key: *** (anything works)
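The model identifier returned by /api/v0/models is what has to go into the model name field. As a sketch, pulling the ids out of that JSON looks like this (the sample payload is hypothetical and abridged — the real response carries more fields per model):

```python
import json

def model_ids(models_response: str) -> list[str]:
    """Extract usable model identifiers from an OpenAI-style model list
    response, i.e. {"data": [{"id": ...}, ...]}."""
    return [m["id"] for m in json.loads(models_response)["data"]]

# Hypothetical, abridged sample of what /api/v0/models might return.
sample = '{"data": [{"id": "qwen3-coder-30b-a3b-instruct-480b-distill-v2"}]}'
print(model_ids(sample))
# → ['qwen3-coder-30b-a3b-instruct-480b-distill-v2']
```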
But please automate this the way Cline and similar tools do. Having to discover and enter all of this manually is unreasonable.