OSS mode outputting chunks
What version of Codex is running?
0.64 / 0.65
What subscription do you have?
None
Which model were you using?
ollama's gpt-oss:120b-cloud
What platform is your computer?
Darwin 23.6.0 arm64 arm
What issue are you seeing?
When running the codex CLI with the --oss flag, every query works normally, except that after the answer is provided, many additional lines of chunks or tokens (or whatever they're called) are printed, sometimes numbering in the hundreds.
What steps can reproduce the bug?
run: codex --oss
enter any query
What is the expected behavior?
No response
Additional information
No response
You mentioned that you're using ollama. I presume that means you're using the older "chat/completions" API, since ollama doesn't yet support the newer "responses" API, which was designed for agentic workflows like codex. You might want to try LM Studio, which supports gpt-oss using the "responses" API. Make sure to specify wire_api = "responses" in your config file.
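For reference, a provider block along these lines should work (rough sketch only: the base_url assumes LM Studio's default local server address, and the model name is just an example; adjust both to your setup):
# Sketch: select the provider and model at the top level of ~/.codex/config.toml
model = "openai/gpt-oss-20b"   # example model id; use whatever LM Studio reports
model_provider = "lmstudio"

[model_providers.lmstudio]
name = "LM Studio"
base_url = "http://localhost:1234/v1"   # LM Studio's default local server
wire_api = "responses"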
With the new update, my local setup (llama.cpp) with litellm is acting differently too. Here's the relevant part of my ~/.codex/config.toml:
oss_provider = "litellm"
[model_providers.litellm]
name = "litellm"
base_url = "http://LITELLM/v1"
env_key = "LITELLM_MASTER_KEY"
wire_api = "chat"
When I use codex with my local gpt-oss-120b I get the same chunked output described above.
After this, I tried:
oss_provider = "litellm"
[model_providers.litellm]
name = "litellm"
base_url = "http://LITELLM/v1"
env_key = "LITELLM_MASTER_KEY"
wire_api = "responses"
But this actually results in worse output, as litellm now can't find my model:
■ unexpected status 404 Not Found: {"error":
{"message":"litellm.NotFoundError: NotFoundError: OpenAIException - .
Received Model Group=gpt-oss-120b-codex-agent\nAvailable Model Group
Fallbacks=None","type":null,"param":null,"code":"404"}}
There's a good chance that this is a litellm problem. But on the Codex side, is there a way to keep using the previous (chat/completions) endpoint just for providers that don't support responses yet? For now I'm planning to stick with the older codex version.
Also, I tested this with v0.63.0, v0.64.0, and v0.65.0. The issue first appears in v0.64.0.
Are you using the latest version of litellm? My understanding is that it recently added support for "responses".
I can confirm I'm using the latest stable. Will probably open an issue over at the litellm repo as well when I get the time.
This may be the same underlying problem as #7579, which we just fixed in version 0.66.0-alpha.5.
I saw this with an earlier version of LM Studio and codex. I'm no longer able to repro this with the latest version of LM Studio and the latest prerelease version of codex (0.66.0-alpha.10).
With 0.66.0 released, I think this is now fixed. Let us know if you're still seeing it.
This still happens on v0.71.0:
$ codex --oss
╭──────────────────────────────────────────────────╮
│ >_ OpenAI Codex (v0.71.0) │
│ │
│ model: gpt-oss:20b medium /model to change │
│ directory: ~/src/codex │
╰──────────────────────────────────────────────────╯
Tip: Switch models or reasoning effort quickly with /model.
› hi
• Hi! How can I help with the Codex repo today?
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• Hi
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• !
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• How
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• can
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• I
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• help
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• with
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• the
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• Cod
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• ex
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• repo
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• today
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
• ?
─ Worked for 4s ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
› Improve documentation in @filename
100% context left · ? for shortcuts
ollama version is 0.13.1.
When codex is used without --oss (pointing at llama.cpp's llama-server with the same model), there is no problem.
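For anyone else wanting to try that workaround, pointing codex at llama-server as a custom provider looks roughly like this (sketch only: the base_url assumes llama-server's default port 8080 with its OpenAI-compatible /v1 endpoint, and the model name is an example):
# Sketch: run `llama-server -m <model.gguf>` separately, then point codex at it
model = "gpt-oss-20b"   # example; llama-server generally serves whichever model it has loaded
model_provider = "llamacpp"

[model_providers.llamacpp]
name = "llama.cpp"
base_url = "http://localhost:8080/v1"   # llama-server's default OpenAI-compatible endpoint
wire_api = "chat"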
I can confirm the issue with codex-cli 0.72.0 and ollama 0.13.3 as well.
This bug report seems to have been ignored. Should we create a new one?
^ please re-open this issue @etraut-openai