Support for OpenAI Responses API
Currently, dstack Models work with the Chat Completions API, but in March OpenAI introduced the Responses API (see their migration guide). OpenAI says the Chat Completions API will not be deprecated, but the Responses API is the recommended default, with new features and improved efficiency. Ecosystem projects (vLLM, Ollama, etc.) are actively adding Responses API support, in large part because Codex does not work with Chat Completions.
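For context, a minimal sketch of how the two request shapes differ, using plain dicts rather than an SDK (the model name and prompt are placeholders): Chat Completions takes a list of role-tagged `messages` at `/v1/chat/completions`, while the Responses API takes a flat `input` at `/v1/responses`.

```python
import json

# Chat Completions: POST /v1/chat/completions with role-tagged messages.
chat_request = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
}

# Responses API: POST /v1/responses with a flat `input`
# (a string, or a list of input items).
responses_request = {
    "model": "my-model",
    "input": "Hello",
}

print(json.dumps(chat_request))
print(json.dumps(responses_request))
```

A proxy that only understands the `messages` shape will reject Responses-style payloads, which is why "OpenAI-compatible" is ambiguous here.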
In our docs, a chat model is said to have an OpenAI-compatible endpoint, which is confusing now that there are multiple OpenAI formats, with the Responses API being the new default.
https://github.com/dstackai/dstack/blob/38e66bc777c67f477d01b7c0b11a30343cf8e1fe/docs/docs/concepts/services.md?plain=1#L163-L165
Consider adding Responses API support to dstack if we continue to advertise OpenAI-compatible endpoints.