Cannot load gpt-oss-20b model in Playground
- In the extension's catalog, download the gpt-oss-20b model locally
- Go to the Playground and select the downloaded model

The result is an error:
```
Failed loading model:gpt-oss-20b-cuda-gpu error: [E:\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1836 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : Error loading "c:\Users\smithasa\.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.18.3-win32-x64\bin\onnxruntime_providers_cuda.dll" which is missing. (Error 126: "The specified module could not be found.")
   at Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess(IntPtr) + 0x54
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx.LoadModelAsync(String, String, String, CancellationToken) + 0x773
   at Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderBase`1.<EnsureModelLoadedAsync>d__44.MoveNext() + 0x6a3]
2025-08-08 17:37:00.291 [info] Information: Microsoft.Neutron.OpenAI.Provider.OpenAIServiceProviderOnnx [1401] 2025-08-08T17:37:00.2875194-07:00 Finish loading model:gpt-oss-20b-cuda-gpu elapsed time:00:00:00.3099748
2025-08-08 17:37:00.312 [error] Failed loading model gpt-oss-20b-cuda-gpu. E:\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1836 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : Error loading "c:\Users\smithasa\.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.18.3-win32-x64\bin\onnxruntime_providers_cuda.dll" which is missing. (Error 126: "The specified module could not be found.")
```
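Win32 error 126 from LoadLibrary usually means either the DLL itself is absent or one of its dependencies (for the CUDA execution provider, typically the CUDA runtime DLLs) could not be resolved. As a quick sanity check, here is a sketch using the extension path from the log above (the version folder 0.18.3 may differ on your install):

```powershell
# Check whether the CUDA execution provider DLL shipped with the extension exists
Test-Path "$env:USERPROFILE\.vscode\extensions\ms-windows-ai-studio.windows-ai-studio-0.18.3-win32-x64\bin\onnxruntime_providers_cuda.dll"
```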
Could you check if the following path exists: `C:\Users\<user name>\.aitk\bin\libonnxruntime_cuda_windows\0.0.3`?
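For example, in PowerShell (the `0.0.3` version folder is taken from the path above and may differ on your machine):

```powershell
# Returns True if the CUDA runtime folder AI Toolkit expects is present
Test-Path "$env:USERPROFILE\.aitk\bin\libonnxruntime_cuda_windows\0.0.3"
```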
BTW, please also check whether you have an NVIDIA GPU on your device.
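One way to verify, assuming the NVIDIA driver is installed (`nvidia-smi` ships with it):

```powershell
# Lists detected NVIDIA GPUs along with the driver and CUDA versions;
# if this command is not found, no NVIDIA driver (and likely no GPU) is present
nvidia-smi
```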
Same issue, weird. Ollama can run the model, but this toolkit can't even detect it.
Added:

```
[~] ollama list
NAME           ID              SIZE     MODIFIED
gpt-oss:20b    aa4295ac10c3    13 GB    28 hours ago
```
Hi @warm3snow, this is a different issue.
I tried Ollama's gpt-oss model and it shows up in AI Toolkit.
AI Toolkit discovers Ollama models by calling the Ollama RESTful API at the address in the OLLAMA_HOST environment variable, defaulting to http://localhost:11434. Could you check whether you have set OLLAMA_HOST to point to another Ollama instance, or are running separate instances of Ollama, for example AI Toolkit in Windows and Ollama in WSL?
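A quick way to check both from PowerShell, assuming the default port (`/api/tags` is the Ollama endpoint that lists locally available models):

```powershell
# Show where AI Toolkit will look for Ollama (empty output means the default is used)
echo $env:OLLAMA_HOST

# Ask the Ollama instance on the default address which models it serves;
# gpt-oss:20b should appear in the response if AI Toolkit can see it too
curl.exe http://localhost:11434/api/tags
```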