fix: preserve assistant/tool call ordering in recorded turns

Open • jxy opened this pull request 1 month ago • 9 comments

Commit cfcc87a95 changed how response items are batched and recorded, which caused tool call objects to be recorded before the assistant message content in some turns.

This could result in a { role: "assistant", content: null, tool_calls: [...] } message being sent before the actual assistant message with text content, breaking ordering expectations downstream.

This change reorders the items collected for a single turn so that, when both occur in that turn, the last non-empty assistant message is placed immediately before the first tool call object, without perturbing prior history.

A new unit test ensures that recorded items now appear in the expected order: assistant text, then tool call object, then the corresponding tool output input item.
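
For illustration, the reordering amounts to something like the sketch below. This is a minimal Python model of the logic, not the actual Rust code in this change; the dict-shaped items and the helper name are assumptions made for readability:

def reorder_turn_items(items):
    # Hypothetical sketch of this PR's reordering; items are modeled as
    # chat-completions-style dicts rather than the real codex-rs types.

    # Index of the first tool call object in this turn, if any.
    tool_idx = next(
        (i for i, it in enumerate(items) if it.get("tool_calls")), None
    )
    # Index of the last non-empty assistant text message, if any.
    msg_idx = next(
        (i for i in range(len(items) - 1, -1, -1)
         if items[i].get("role") == "assistant"
         and items[i].get("content")
         and not items[i].get("tool_calls")),
        None,
    )
    # Only reorder when the assistant text currently trails the tool call.
    if tool_idx is None or msg_idx is None or msg_idx < tool_idx:
        return items
    # Move the text message to sit immediately before the tool call object;
    # popping the later index first keeps tool_idx valid.
    items.insert(tool_idx, items.pop(msg_idx))
    return items

Prior history outside the current turn is never touched, since only the items collected for this turn are passed in.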

jxy avatar Nov 20 '25 23:11 jxy

Thanks for the contribution.

Is there an open bug report associated with this PR? We ask that all bug fix PRs have a linked bug report (see contribution guidelines).

It's not clear to me what problem this PR is solving. You mentioned that the recent change is "breaking ordering expectations downstream". Whose expectations, and how does that manifest?

etraut-openai avatar Nov 20 '25 23:11 etraut-openai

It only affects the chat/completions endpoint with OpenAI-compatible API servers.

With the current codex:

codex-cli 0.61.0

Test with the following. Run a test server (needs nc and jq). Each nc invocation serves exactly one request: the first heredoc answers the initial turn with a tool call, and the second answers the follow-up turn. The sed/jq pipeline prints the messages from each incoming request body, skipping the first two (system prompt and environment context):

{
cat <<'EOF'
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":"ASSISTANT MESSAGE HERE"},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":null,"tool_calls":[{"index":0,"id":"call_testid","type":"function","function":{"name":"shell","arguments":""}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"}}]},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{},"finish_reason":"tool_calls"}]}

data: [DONE]
EOF
} | nc -l 8888 | sed '/^[^{]/d' | jq '.messages[2:]'

{
cat <<'EOF'
HTTP/1.1 200 OK
Content-Type: text/event-stream

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{"content":"CHECK WHAT I RECEIVED"},"finish_reason":null}]}

data: {"id":"chatcmpl-testid","object":"chat.completion.chunk","created":1763690000,"model":"gpt-test","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
EOF
} | nc -l 8888 | sed '/^[^{]/d' | jq '.messages[2:]'

From another terminal, connect with codex:

codex \
-c 'profiles.test.model_provider="test"' \
-c 'model_providers.test.name="test"' \
-c 'model_providers.test.base_url="http://localhost:8888/v1"' \
-p test --full-auto \
'A quick test.'

Our nc server will then print the following:

[
  {
    "role": "user",
    "content": "A quick test."
  }
]
[
  {
    "role": "user",
    "content": "A quick test."
  },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_testid",
        "type": "function",
        "function": {
          "name": "shell",
          "arguments": "{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"
        }
      }
    ]
  },
  {
    "role": "assistant",
    "content": "ASSISTANT MESSAGE HERE"
  },
  {
    "role": "tool",
    "tool_call_id": "call_testid",
    "content": "Exit code: 0\nWall time: 0 seconds\nOutput:\nDarwin\n"
  }
]

You can see that "ASSISTANT MESSAGE HERE" was sent after the tool_calls request. Previous versions had the assistant content come before tool_calls, which should be the correct order. Bisecting the commits led me to cfcc87a, which intended to move tool_calls before the results of the tool call, yet neglected the case where the assistant sends non-empty content, which then got placed after the tool_calls.

This commit moves assistant messages, if any, before the tool_calls, and the unit test asserts that. With this change, you should see the correct order:

[
  {
    "role": "user",
    "content": "A quick test."
  }
]
[
  {
    "role": "user",
    "content": "A quick test."
  },
  {
    "role": "assistant",
    "content": "ASSISTANT MESSAGE HERE"
  },
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_testid",
        "type": "function",
        "function": {
          "name": "shell",
          "arguments": "{\"command\":[\"bash\",\"-lc\",\"uname\"],\"timeout_ms\":120000,\"workdir\":\"/tmp\"}"
        }
      }
    ]
  },
  {
    "role": "tool",
    "tool_call_id": "call_testid",
    "content": "Exit code: 0\nWall time: 0 seconds\nOutput:\nDarwin\n"
  }
]
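
If you want to check the order mechanically instead of eyeballing the jq output, something like the following works (a throwaway Python check; the turn2.json filename is just an assumption for wherever you saved the second jq dump):

import json

# Load the second request's messages (the turn containing the tool call),
# assumed to have been redirected from jq into turn2.json.
with open("turn2.json") as f:
    messages = json.load(f)

# (role, has tool_calls) pairs, in recorded order.
roles = [(m["role"], bool(m.get("tool_calls"))) for m in messages]
assert roles == [
    ("user", False),       # the prompt
    ("assistant", False),  # assistant text first
    ("assistant", True),   # then the tool call object
    ("tool", False),       # then the tool output
], roles
print("ordering OK")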

jxy avatar Nov 21 '25 04:11 jxy

And I just saw a bug report.

This PR should fix #7051

jxy avatar Nov 21 '25 04:11 jxy

codex --version
codex-cli 0.61.0

I'm using the [email protected] version via brew cask on my local machine and always face the following error. Is there a way to switch back to the previous cask version? I tried "brew install --cask [email protected]" and similar, but couldn't install it.

{
  "error": {
    "message": "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_2hWYykIE9bKkFT6AFNOPseAN",
    "type": "invalid_request_error",
    "param": "messages.[5].role",
    "code": null
  }
}

211211 avatar Nov 21 '25 11:11 211211

I believe this issue is model-dependent. When the messages are arranged in the sequence [assistant (tool call), assistant (content), tool (content)], some models return an error while others do not. For example, when I used glm-4.6 and sent requests in that order, no error occurred. When I submitted the same request to deepseek-chat, it returned an HTTP 400 response with the following error:

ERROR: unexpected status 400 Bad Request: {"error":{"message":"An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. (insufficient tool messages following tool_calls message)","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

But it does indeed affect the GPT-series models.
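
The constraint those stricter backends appear to enforce is roughly the following. This is a hypothetical Python sketch of the server-side check, not any provider's actual code:

def validate_tool_call_ordering(messages):
    # Every assistant message with tool_calls must be immediately followed
    # by tool messages answering each of its tool_call_ids.
    for i, msg in enumerate(messages):
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            pending = {tc["id"] for tc in msg["tool_calls"]}
            j = i + 1
            while pending and j < len(messages) and messages[j].get("role") == "tool":
                pending.discard(messages[j].get("tool_call_id"))
                j += 1
            if pending:
                raise ValueError(
                    "tool_call_ids without response messages: "
                    + ", ".join(sorted(pending))
                )

Under the [assistant (tool call), assistant (content), tool (content)] ordering, the assistant text message interrupts the run of tool messages right after the tool call, so the check fails; more lenient backends presumably match each tool_call_id anywhere later in the conversation instead.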

Below is the request log I captured from Codex (version: 0.63.0):

{
    "model": "glm-4.6",
    "messages": [
        {
            "role": "system",
            "content": "You are a coding agent running in the Codex CLI, a terminal-based coding assistant. Codex CLI is....."
        },
        {
            "role": "user",
            "content": "<environment_context>...</environment_context>"
        },
        {
            "role": "user",
            "content": "write foo into bar file and save in the current location"
        },
        {
            "role": "assistant",
            "content": null,
            "tool_calls": [
                {
                    "id": "call_c64a59b760fa422e8e603a2f",
                    "type": "function",
                    "function": {
                        "name": "exec_command",
                        "arguments": "{\"cmd\":\"echo \\\"foo\\\" > bar\"}"
                    }
                }
            ]
        },
        {
            "role": "assistant",
            "content": "\nI'll create a file named \"bar\" with the content \"foo\" in the current location.\n"
        },
        {
            "role": "tool",
            "tool_call_id": "call_c64a59b760fa422e8e603a2f",
            "content": "Chunk ID: 987afe\nWall time: 0.0261 seconds\nProcess exited with code 0\nOriginal token count: 0\nOutput:\n"
        }
    ],
    "stream": true,
    "tools": [...]
}

And Codex can finish the above task only with glm-4.6.

Below is my model config:

[model_providers.glm_4_6]
name = "glm_4_6"
base_url = "https://open.bigmodel.cn/api/coding/paas/v4"
env_key = "GLM_API_KEY"
wire_api = "chat"
query_params = {}

[model_providers.deepseek-chat]
name = "DeepSeek"
base_url = "https://api.deepseek.com"
env_key = "DEEPSEEK_API_KEY"
wire_api = "chat"
query_params = {}

448523760 avatar Nov 22 '25 14:11 448523760

@etraut-openai any chance this could be merged to fix #7051 ?

svkozak avatar Nov 25 '25 19:11 svkozak

Also seeing this error, had to downgrade to 0.58.0 😢

na-jakobs avatar Nov 26 '25 19:11 na-jakobs

@codex review

jif-oai avatar Dec 01 '25 18:12 jif-oai

Codex Review: Didn't find any major issues. Keep them coming!

Also affected by #7051 at my institution, would love to see this merged in!

robinson96 avatar Dec 02 '25 21:12 robinson96

Here's a quick update on this issue... The correct fix for this problem is more involved than the proposed fix in this PR. We're working on a separate fix. I'll leave this PR open for now, but we'll likely end up closing it once we merge the other change.

etraut-openai avatar Dec 02 '25 22:12 etraut-openai

The fix that I mentioned above was just merged: #7310. That should address this issue, so I'm going to close this PR.

Thanks @jxy for taking the time to implement a fix and submit a PR.

etraut-openai avatar Dec 04 '25 22:12 etraut-openai