Test command defaults to OpenAI Responses API in v0.0.8
You closed the issue: https://github.com/microsoft/poml/issues/105
I have just updated to 0.0.8. I set "poml.languageModel.apiUrl", "poml.languageModel.apiKey", and "poml.languageModel.apiVersion" (to "gpt-4.1") to access the model from my GitHub Pro subscription. As proof that everything works fine, here is a screenshot of the same settings working with the FastAgent Python package.
Here is the error output when testing the POML file.
```
2025-08-25 14:26:33.139 [info] Testing prompt with chat model: e:\Projects\BetfairAiTrading\docs\POML\ActiveBetfairMarketPrompt.poml
2025-08-25 14:26:33.154 [info] [Request parameters] {"model":{"specificationVersion":"v2","supportedUrls":{"image/*":[{}]},"modelId":"gpt-4o","config":{"provider":"openai.responses","fileIdPrefixes":["file-"]}},"maxRetries":0,"temperature":0.5}
2025-08-25 14:26:33.530 [error] AI_APICallError: Not Found
    at c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:67667:14
    at processTicksAndRejections (node:internal/process/task_queues:105:5)
    at postToApi (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:67505:28)
    at OpenAIResponsesLanguageModel.doStream (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:94360:50)
    at fn (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:37842:27)
    at c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:34525:22
    at _retryWithExponentialBackoff (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:34676:12)
    at streamStep (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:37798:15)
    at fn (c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:38139:9)
    at c:\Users\Stefan\.vscode\extensions\poml-team.poml-0.0.8-win32-x64\dist\extension.js:34525:22
```
Normally you would test with "chat completion models", so "test with text completion models" looks very suspicious. Make sure you don't accidentally hit the non-chat mode when testing.
I'll also look into improving the debugging, perhaps by adding a non-streaming mode. The current error message is too vague and confusing.
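Something like this is what I have in mind — a rough sketch on top of the Vercel AI SDK (model name and env vars are placeholders), not POML's actual code:

```typescript
// Rough sketch of a non-streaming test path with the Vercel AI SDK
// (illustrative only; POML's actual wiring differs).
import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const openai = createOpenAI({
  baseURL: process.env.POML_API_URL, // your poml.languageModel.apiUrl
  apiKey: process.env.POML_API_KEY,
});

export async function testPromptNonStreaming(prompt: string) {
  try {
    // generateText waits for the complete response instead of consuming a
    // stream, so a failing request surfaces as one complete error object.
    const { text } = await generateText({
      model: openai.chat('gpt-4.1'),
      prompt,
      temperature: 0.5,
    });
    console.log(text);
  } catch (err) {
    // AI_APICallError carries url, statusCode, and responseBody — much more
    // informative than a bare "Not Found" from the middle of a stream.
    console.error(err);
  }
}
```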
Also check out this doc: https://microsoft.github.io/poml/stable/python/integration/mcp/
I think you can use MCP in POML in a better way.
Just for reference, the first time I installed POML in VS Code, I set up GitHub GPT-4.1 and it worked. Of course, MCP did not work, but at least the LLM replied. Now it has stopped working after I installed version 0.0.8.
I used https://microsoft.github.io/poml/stable/typescript/ and not the Python one, as I know Python works for me, at least with the FastAgent MCP client.
I am a .NET software developer using F# or C#, and since the C# version of the MCP client does not work for my use case, I was trying to use a different programming language, in this case, your POML JavaScript or TypeScript.
I'm sorry to hear that POML 0.0.8 stopped working for your models.
Have you checked the chat model vs. the text model? Are you sure you are using the chat models from the play button menu?
POML 0.0.8 switched from the LangChain backend to the Vercel AI SDK because of complaints about LangChain, so the configuration might be slightly different. To help you debug, would you mind sharing your provider and apiUrl settings?
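For reference, a rough sketch of the two endpoint families in the Vercel AI SDK (illustrative only; the baseURL here is just your apiUrl setting). The "provider":"openai.responses" in your request log corresponds to the first variant:

```typescript
// Illustrative sketch (not POML's exact code): the same "openai" provider in
// the Vercel AI SDK can target two different endpoint families.
import { createOpenAI } from '@ai-sdk/openai';

const provider = createOpenAI({
  baseURL: 'https://models.github.ai/inference', // your poml.languageModel.apiUrl
  apiKey: process.env.GITHUB_TOKEN,
});

// POSTs to {baseURL}/responses — the "openai.responses" in the request log
const responsesModel = provider.responses('gpt-4.1');

// POSTs to {baseURL}/chat/completions — the classic Chat Completions API
const chatModel = provider.chat('gpt-4.1');
```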
Yes, I tested both Chat Models and Text Completion Models; both throw the same errors.
May I ask you which TypeScript library offers similar features to Python's FastAgent?
Are you familiar with https://github.com/convo-lang/convo-lang? Its author told me that MCP support is not fully implemented but offered help, so I am waiting for a release.
Don't get me wrong, I am not tied to a specific programming language but am searching for a working solution in any programming language or technology.
Your Microsoft approach reminds me of the issue with https://github.com/modelcontextprotocol/csharp-sdk/discussions/509#discussioncomment-13595867, so it seems that the only effective solution for the MCP client agent workflow is Python's FastAgent, which seems quite strange.
I just opened #143 before seeing this issue. I've had the same issue since v0.0.8; however, I don't have anything MCP- or tool-call-related.
@StefanBelo I'm not familiar with convo-lang, but I know other TS tools that have similar features.
Just to clarify, POML only helps you write single prompts. We are not expanding to the whole chat experience or building agents, at least not without other frameworks or programming languages.
I think fast-agent is just a wrapper that hides the details, similar to other agent frameworks like LangChain or the OpenAI Agents SDK. Don't get me wrong; I mean there are only two fundamental ways to use MCP: use it in a local client, or use it from a remote server. We use it locally, like many do. And since we are not an agent framework, I don't think building a clean wrapper is our top priority.
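For the record, the "local client" pattern looks roughly like this with the official MCP TypeScript SDK (a minimal sketch; the server command and the "search" tool are placeholders):

```typescript
// Minimal sketch of the "local client" MCP pattern using the official
// @modelcontextprotocol/sdk. Server command and tool name are hypothetical.
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

// Spawn the MCP server as a local subprocess and talk to it over stdio.
const transport = new StdioClientTransport({
  command: 'node',
  args: ['./my-mcp-server.js'],
});

const client = new Client({ name: 'local-demo', version: '1.0.0' });
await client.connect(transport);

// Discover the tools the server exposes, then call one of them.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

const result = await client.callTool({
  name: 'search',
  arguments: { query: 'example' },
});
console.log(result);
```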
Wondering if you had any luck with the testing feature in VS Code. That looks like a critical issue.
I double-checked my config. The following works:

```json
"poml.languageModel.provider": "openai",
"poml.languageModel.model": "gpt-4.1-mini",
"poml.languageModel.apiUrl": "https://aihubmix.com/v1",
"poml.languageModel.apiKey": "sk-xxxxxxx",
```
@ultmaster Sir, I understand what POML is meant for. In my tests with different LLMs, I noticed that I cannot use the same general prompt across models, because the same prompt generates different results on each one.
Therefore, I tried to work with POML to build structured prompts with less effort.
Forget that I mentioned MCP or whatever earlier.
Both @Fuzzillogic and I have updated POML to 0.0.8, and it has stopped working. That is the issue. Do we understand each other?
You didn't provide your provider and apiUrl, and you didn't compare your config with the one I provided, so I can hardly help you.
I think it's probably a silly problem like a missing /v1 or a missing /openai in the apiUrl (note the /v1 in the working https://aihubmix.com/v1 above).
I've run into the problem mentioned in #143 myself, and I have to admit it's a bit hard to debug. We need to work on improving the debugging experience somehow.
I actually provided that information in my first post: I use GitHub Models, and as proof that they support the OpenAI communication standard, I was able to interact with this provider's models using FastAgent. So API communication with the model works. It also worked with the previous version of POML, with the same settings. Here is the exact configuration in my settings:

```json
"poml.languageModel.provider": "openai",
"poml.languageModel.apiUrl": "https://models.github.ai/inference",
"poml.languageModel.apiKey": "ghp_xx",
"poml.languageModel.apiVersion": "gpt-4.1",
```
And sir, read today's issues: other people report that the new version does not work for them either. Yesterday, I read another post where a user mentioned that you use a different pomljs version in your codebase. Maybe that is related.
I checked the docs of https://models.github.ai/inference and figured out that they don't support the Responses API yet.
I'll post a hot-fix to quickly revert to the Chat Completions API.
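The essence of the change, sketched against the Vercel AI SDK (the real fix lives in the extension code):

```typescript
// Sketch of the hot-fix's core idea: construct the model against
// Chat Completions instead of the Responses API.
import { createOpenAI } from '@ai-sdk/openai';

function makeModel(apiUrl: string, apiKey: string, modelId: string) {
  const provider = createOpenAI({ baseURL: apiUrl, apiKey });
  // before (v0.0.8): provider.responses(modelId)
  //   → 404 "Not Found" on providers that don't implement /responses
  // after: plain /chat/completions, which OpenAI-compatible hosts support
  return provider.chat(modelId);
}
```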
It seems the DeepSeek API does not support the OpenAI Responses API either.
In your test, you used https://aihubmix.com/models.
Does AiHubMix support the Responses API for the hosted DeepSeek model? We can simply buy credits there. I think most people here pay GitHub for Copilot models, but I tested other providers as well. So, what do you suggest?
The hot-fix is merged. If everything went well, it will appear in the nightly build tonight.
AiHubMix is a Chinese service, the one I use most often. It provides features similar to OpenRouter, but cheaper, and it supports the Responses API. We also usually use Cherry Studio, but I haven't personally tried that.
The rate limit for GitHub Models is quite low, judging from the docs. Basically, you can't count on it unless you are simply debugging. And Microsoft enterprise accounts haven't been enrolled into GitHub Models yet. So what can I say...
Thank you for your suggestions. Last month, I was unsure why I needed a GitHub Pro subscription for their models and actually wrote a post about it on Reddit.
I now have access to AiHubMix models, and when setting poml.languageModel.apiUrl, the "Test current prompt" functionality started working again.
However, in the AiHubMix logs, I noticed that an unintended model was being used: I had configured "deepseek-chat", but "gpt-4o" was used instead. I then tried to adjust all the model preferences in your settings form, but the next test on the POML file failed with the same errors I mentioned earlier. Restoring a working state was not easy.
The current status: the behavior I need is, of course, what GitHub Chat provides. Convo-Lang promises to enable such prompt execution, but I still have not tested it.
I notice the gpt-4o in this snapshot. So you configured deepseek-chat, but it used gpt-4o here?
Because when I tried to use "deepseek-chat," it ended with an error.
What is strange is that when I changed the setting to "deepseek-chat" for the first time, no error was thrown. However, on the next attempt, this error was thrown. Both the GPT-4o and deepseek-chat models are used from https://aihubmix.com/v1, and I do not have any problems at all using any model from https://aihubmix.com/v1 in Cherry Studio, for instance. So, it seems that some kind of problem is in your POML codebase.
Just tried deepseek-chat on AiHubMix. It does not support the Responses API either; the Chat Completions API works fine.
I recommend trying the nightly build: https://poml-vscode-nightly.scottyugochang.workers.dev/
The problem seems quite annoying if you are using non-official OpenAI-compatible providers.
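If you want to check a provider yourself, a quick probe against the two standard OpenAI endpoint paths tells you which API family it implements. A plain-fetch sketch, with the model and base URL taken from this thread:

```typescript
// Quick probe (sketch): POST a trivial request to both endpoint families and
// compare status codes. A 404 on /responses with a 200 on /chat/completions
// means the provider only implements the classic Chat Completions API.
const base = 'https://aihubmix.com/v1';
const headers = {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${process.env.API_KEY}`,
};

for (const path of ['/responses', '/chat/completions']) {
  const body =
    path === '/responses'
      ? { model: 'deepseek-chat', input: 'ping' } // Responses API shape
      : { model: 'deepseek-chat', messages: [{ role: 'user', content: 'ping' }] };
  const res = await fetch(base + path, {
    method: 'POST',
    headers,
    body: JSON.stringify(body),
  });
  console.log(path, res.status); // e.g. "/responses 404", "/chat/completions 200"
}
```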
I changed the title so that whoever encounters a similar problem could find the solution.
I have some information from my local AI server logs that may explain the issue with the VS Code extension.
I get the same kind of errors as @StefanBelo:
> Because when I tried to use "deepseek-chat," it ended with an error.
>
> What is strange is that when I changed the setting to "deepseek-chat" for the first time, no error was thrown. However, on the next attempt, this error was thrown. Both the GPT-4o and deepseek-chat models are used from https://aihubmix.com/v1, and I do not have any problems at all using any model from https://aihubmix.com/v1 in Cherry Studio, for instance. So, it seems that some kind of problem is in your POML codebase.
The prompts may not have the chat_template attribute, which then causes Python code exceptions that prevent prompt execution. This did not occur in v0.0.7, which is why identical server configurations no longer work.
Yep. Still a Responses API error. Please try the nightly build: https://poml-vscode-nightly.scottyugochang.workers.dev/
I use deepseek-chat too, +1. So this model cannot be used by POML?
settings.json:

```json
"poml.trace": "verbose",
"poml.languageModel.model": "deepseek-chat",
"poml.languageModel.apiKey": "sk-xxxxxxxxxxxxxxxxxx",
"poml.languageModel.apiUrl": "https://api.deepseek.com"
```
output:

```
2025-10-10 15:52:50.569 [info] Testing prompt with chat model: /poml/examples/109_math_verifier.poml
2025-10-10 15:52:50.661 [info] [Request parameters] {"model":{"specificationVersion":"v2","supportedUrls":{"image/*":[{}]},"modelId":"deepseek-chat","config":{"provider":"openai.responses","fileIdPrefixes":["file-"]}},"maxRetries":0,"temperature":0.5}
2025-10-10 15:52:50.896 [error] AI_APICallError: Not Found
    at /Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:67635:14
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async postToApi (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:67505:28)
    at async OpenAIResponsesLanguageModel.doStream (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:94360:50)
    at async fn (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:37842:27)
    at async /Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:34525:22
    at async _retryWithExponentialBackoff (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:34676:12)
    at async streamStep (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:37798:15)
    at async fn (/Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:38139:9)
    at async /Users/duanliming/.vscode/extensions/poml-team.poml-0.0.8-darwin-arm64/dist/extension.js:34525:22
```
