claude-code icon indicating copy to clipboard operation
claude-code copied to clipboard

[Meta] tool_use/tool_result block mismatch causing bad conversation state (150+ reports)

Open emcd opened this issue 6 months ago • 28 comments

Environment

  • Platform (select one):
    • [x] Anthropic API, others
  • Claude CLI version: multiple, including 1.0.94; have encountered the issue after the alleged fix in 1.0.84
  • Operating System: multiple
  • Terminal: multiple

Bug Description

Claude Code users are experiencing conversation disruptions due to mismatched Tool Use and Tool Result blocks. This can occur due to transient network failures, server errors, or other failures around hook executions or tool calls. (Both times I have seen this in the past two weeks have been after a hook execution.) As of this writing the following issue tracker query ( is:issue Each tool_use block must have a corresponding tool_result block in the next message) shows 150+ issues.

Open Issues (63): #5662, #6443, #6551, #3886, #3636, #6302, #5599, #5509, #3512, #3632, #3860, #4454, #3916, #3003, #5928, #5374, #6628, #5765, #3637, #2126, #6242, #2616, #2312, #5747, #1800, #3504, #2967, #2961, #2406, #2903, #2352, #2249, #2959, #3754, #6178, #2232, #2704, #4401, #2242, #2802, #5713, #1978, #4425, #3549, #2369, #6585, #2043, #2672, #2261, #4409, #3532, #4466, #3608, #1796, #2971, #3564, #4038, #3983, #5385, #6595, #769, #1672, #3982, #1608

Closed Issues (92): #6575, #6566, #6539, #6567, #5705, #5989, #6521, #5468, #6363, #6410, #5424, #5375, #5450, #5317, #4842, #5060, #4638, #4664, #4563, #3086, #5470, #6343, #4096, #3639, #5594, #4825, #4180, #4798, #1887, #5193, #5479, #3101, #6348, #3331, #6502, #4877, #2697, #1831, #3149, #2157, #1782, #4522, #1776, #1686, #2840, #5410, #2796, #2828, #1894, #4134, #2600, #1574, #1237, #5457, #1881, #3902, #3977, #2179, #1968, #4159, #2201, #2423, #6490, #1747, #1642, #1969, #1738, #1584, #2630, #1577, #3862, #3202, #473, #1586, #1695, #5412, #1571, #5246, #3191, #1956, #1856, #4240, #2708, #1678, #1174, #1579, #558, #1761, #2494, #5476, #5421, #4211, #4283, #2045, #2582, #746, #3985, #4065, #3616, #1679

(My apologies to anyone whose issue was erroneously linked - I asked Claude to pull the gh issue list results together for me and did not verify each and every link.)

Most of the closed issues are closed because they are duplicates, not because they are resolved. Some of the closed issues have significant threads with multiple people confirming the issue, which means that the hit rate is significantly higher than 150+ issues.

One person developed a remediation, which involves exporting the conversation, clearing the history, restarting Claude Code, and then importing the conversation history. While it is good to have a workaround like this, it is obviously painful and an in-program mitigation would be preferred.

Steps to Reproduce

Not consistently reproducible.

Expected Behavior

Graceful recovery.

Actual Behavior

No more conversation turns can be generated.

Additional Context

To deal with this issue in my personal AI workbench last year, I developed the following Python code which can be adapted to Typescript/Node.js fairly easily. It makes an O(n) backwards pass through the message history, collecting tool use IDs from results and then detecting any tool uses which do not have those IDs in the collected set:

    ''' Filters out tool use blocks that have no matching result. '''
    tool_result_ids = set( )
    filtered_messages: list[ AnthropicMessage ] = [ ]
    for message in reversed( messages ):
        content = message[ 'content' ]
        if not isinstance( content, list ):
            filtered_messages.append( message )
            continue
        filtered_blocks = [ ]
        for block in content:
            if not isinstance( block, dict ):
                filtered_blocks.append( block )
                continue
            match block.get( 'type' ):
                case 'tool_result':
                    if 'tool_use_id' in block:
                        tool_result_ids.add( block[ 'tool_use_id' ] )
                    filtered_blocks.append( block )
                case 'tool_use':
                    if block.get( 'id' ) in tool_result_ids:
                        filtered_blocks.append( block )
                case _:
                    filtered_blocks.append( block )
        if filtered_blocks:
            message_filtered = dict( message )
            message_filtered[ 'content' ] = filtered_blocks
            filtered_messages.append( message_filtered )
    return list( reversed( filtered_messages ) )

Or, you could open your sources so that others could help you patch serious issues....

emcd avatar Aug 29 '25 19:08 emcd

It is indeed a rather irritating bug, especially since fixing it (or at least, adding a workaround while you figure out the root cause...) is really not very difficult. Claude code is essentially unusable at the moment for me.

ldorigo avatar Sep 01 '25 19:09 ldorigo

@bcherny I remember you were on top of this issue last time I had it (source) thought it might be helpful to ping incase of useful context on prior fix etc!

My steps to reproduce:

  • 1 begin subagent task delegation step
    • induce network dropout (e.g, disconnect wifi) - dropouts happen frequently for me in real-world use
  • 2 note "API timeout" error begins to occur
    • reconnect internet
    • note subagent appears frozen
  • 3 press "ESC" and revert to message just before subagent delegation began
    • re-send message (or send a different one)
  • ✅ reproduced - note that the conversation history is now broken
    • will deterministically get invalid_request_error.

screenshot of mine:

Image

zazer0 avatar Sep 17 '25 06:09 zazer0

Thanks for pinging the lead developer, @zazer0 . Was tempted to do this soon, but wanted to give the team to work through the huge volume of issues they have before flagging harder. Looks like a couple of Anthropic-associated people are assigned now.


Update:

I've been bitten by this several more times in the past week, using the latest 1.0.1xy versions of Claude Code.

And here are the related issues which have come in since this issue was filed: #6859, #7327, #7484, #7570, #7691, #7796

emcd avatar Sep 18 '25 00:09 emcd

This issue has a number of causes. While we are always monitoring instances of this error and and looking to fix them, it's unlikely we will ever completely eliminate it due to how tricky concurrency problems are in general. Feel free to share transcripts and complaints here, we'll keep an eye on this issue and close the many duplicates

wolffiex avatar Sep 18 '25 20:09 wolffiex

This issue has a number of causes.

Correct and this was stated in the original post (failed tool calls, transient network failures, server errors).

While we are always monitoring instances of this error and and looking to fix them, it's unlikely we will ever completely eliminate it due to how tricky concurrency problems are in general.

Part of what the original post suggested is that you mitigate the symptoms rather than trying to track down and eliminate every root cause. Some code, which is successfully used to mitigate them in another piece of software, was provided as reference.

I am betting that most people would rather have Claude Code drop an occasional failed hook (or whatever) and continue working than have their entire conversation come to a hard stop until they edit the offending tool uses/results out of the underlying JSON.


Open source agentic coders, like Opencode, have essentially caught up to you (and even surpassed you in some areas)... and can even use your models. And other ones, like Codex CLI and Gemini CLI, are rapidly closing the gap.

Your team seems fairly small. If it wants to spend its resources on creating rainbow-colored "ultrathink" keywords rather than mitigating actual user pain, that's fine, but it might not play out well over the course of the next few months.

emcd avatar Sep 18 '25 22:09 emcd

This is killing me. It stops my autonomous workflows and prevents me from going back and forth in "finished" workflows. For example, I can't interrupt a subagent and give it guidance because most of the time, in such cases, I get this error. And sometimes subagents stop working, and I tell them to continue, only to find out they stopped because of this error, which reappears after my prompt.

I've given some examples and analyzed many related issues in https://github.com/anthropics/claude-code/issues/5374.

I've had this error from version v1.0.71, and I still have it in v1.0.126.

almirsarajcic avatar Sep 26 '25 12:09 almirsarajcic

@wolffiex @igorkofman : Does Claude Code 2.0 use the recently-announced context-management-2025-06-27 beta header or are there plans to have it use this header? Based on my reading of the documentation, it seems to relax the API requirement for matching tool use and tool use result blocks. This would mitigate the majority, if not all of the issues, reported here, if true.

Hint at possibly relaxed API strictness from documentation:

When activated, the API automatically clears the oldest tool results in chronological order, replacing them with placeholder text to let Claude know the tool result was removed


Edit: I see the 2.0 release notes do not say anything about the header but have this beacon of hope:

• Hooks: Reduced PostToolUse 'tool_use' ids were found without 'tool_result' blocks errors

Thanks. Let's hope this is a real fix this time.

emcd avatar Sep 29 '25 22:09 emcd

Happens maybe less often, but still there in 2.0.1 with Sonnet 4.5.

Image

almirsarajcic avatar Sep 30 '25 07:09 almirsarajcic

TypeScript Implementation Proposal

Here's a TypeScript/Node.js adaptation of the Python solution provided in the issue, with additional implementation ideas:

Root Cause

When a tool execution is interrupted (via Ctrl+C or timeout), Claude Code fails to insert the required tool_result block, violating the Claude API specification that every tool_use must be immediately followed by a corresponding tool_result.

TypeScript Implementation

interface AnthropicMessage {
  role: string;
  content: string | ContentBlock[];
}

interface ContentBlock {
  type: string;
  id?: string;
  tool_use_id?: string;
  [key: string]: any;
}

/**
 * Filters out tool_use blocks that have no matching tool_result.
 * Makes an O(n) backwards pass to collect result IDs first.
 */
function filterOrphanedToolUses(messages: AnthropicMessage[]): AnthropicMessage[] {
  const toolResultIds = new Set<string>();
  const filteredMessages: AnthropicMessage[] = [];

  // Backwards pass to collect tool_result IDs
  for (const message of [...messages].reverse()) {
    const content = message.content;

    if (typeof content === 'string') {
      filteredMessages.push(message);
      continue;
    }

    const filteredBlocks: ContentBlock[] = [];

    for (const block of content) {
      if (typeof block !== 'object') {
        filteredBlocks.push(block);
        continue;
      }

      switch (block.type) {
        case 'tool_result':
          if (block.tool_use_id) {
            toolResultIds.add(block.tool_use_id);
          }
          filteredBlocks.push(block);
          break;

        case 'tool_use':
          // Only include tool_use if we have a matching result
          if (block.id && toolResultIds.has(block.id)) {
            filteredBlocks.push(block);
          }
          break;

        default:
          filteredBlocks.push(block);
      }
    }

    if (filteredBlocks.length > 0) {
      filteredMessages.push({
        ...message,
        content: filteredBlocks
      });
    }
  }

  return filteredMessages.reverse();
}

Interrupt Handler Implementation

Additionally, here's how to properly handle interrupts to prevent the issue:

class ToolExecutionManager {
  private activeToolCalls = new Map<string, AbortController>();

  async executeToolWithInterruptHandling(
    toolUseId: string,
    toolName: string,
    toolParams: any,
    onInterrupt?: () => void
  ): Promise<{ result: any; interrupted: boolean }> {
    const abortController = new AbortController();
    this.activeToolCalls.set(toolUseId, abortController);

    // Register interrupt handlers
    const cleanup = () => {
      process.removeListener('SIGINT', interruptHandler);
      process.removeListener('SIGTERM', interruptHandler);
      this.activeToolCalls.delete(toolUseId);
    };

    const interruptHandler = () => {
      abortController.abort();
      if (onInterrupt) onInterrupt();
    };

    process.on('SIGINT', interruptHandler);
    process.on('SIGTERM', interruptHandler);

    try {
      const result = await this.executeTool(
        toolName,
        toolParams,
        abortController.signal
      );
      cleanup();
      return { result, interrupted: false };
    } catch (error) {
      cleanup();

      if (error.name === 'AbortError' || abortController.signal.aborted) {
        // Tool was interrupted - return synthetic result
        return {
          result: {
            type: 'tool_result',
            tool_use_id: toolUseId,
            is_error: true,
            content: 'Tool execution interrupted by user'
          },
          interrupted: true
        };
      }

      throw error;
    }
  }

  private async executeTool(
    name: string,
    params: any,
    signal: AbortSignal
  ): Promise<any> {
    // Actual tool execution logic here
    // Should check signal.aborted periodically for long operations
  }
}

Immediate Mitigation

For immediate relief, users can add this to their shell config to detect and warn about the issue:

# Add to ~/.bashrc or ~/.zshrc
alias claude-check='claude-code history export | grep -c "tool_use" | xargs -I {} sh -c '\''uses={}; results=$(claude-code history export | grep -c "tool_result"); if [ "$uses" -ne "$results" ]; then echo "⚠️  WARNING: Mismatched tool blocks detected ($uses uses, $results results). Consider restarting."; fi'\'''

Testing

I've tested the filter function with various scenarios including:

  • ✅ Properly paired tool_use/tool_result blocks (preserved correctly)
  • ✅ Orphaned tool_use blocks from interrupts (removed as expected)
  • ✅ Mixed valid and orphaned blocks (filters correctly)
  • ✅ String content messages (preserved)
  • ✅ Edge cases like empty arrays

The TypeScript implementation is adapted from the Python solution in the original issue and has been verified to work correctly. The interrupt handler is a proposed approach to proactively prevent the issue.

Hope this helps!

jamestexas avatar Oct 04 '25 01:10 jamestexas

Although I have seen no further problems myself, it is clear that many people are still being bitten by this. Here is the latest batch since last update (courtesy of Claude):

Open issues: #7929, #7947, #7991, #8004, #8077, #8187, #8201, #8303, #8325, #8425, #8507, #8612, #8652, #8746, #8763, #8783, #8790, #8817, #8818, #8821, #8847, #8867, #8887, #8893, #8894, #8895, #8903, #8929, #8931, #8940

Closed (as duplicate) issues: #7882, #8190, #8210, #8231, #8233, #8261, #8286, #8337, #8393, #8420, #8553, #8565, #8573, #8822

That is another 44 issues in the matter of several weeks, including after the alleged fix in 2.0.0, @wolffiex @igorkofman .

emcd avatar Oct 05 '25 01:10 emcd

Summary of me rn: ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

BY-SAMBO avatar Oct 06 '25 08:10 BY-SAMBO

This is making Claude Code output the HOOKS OUTPUT as USER MESSAGES and KEEPS REACTING to them in a NEVERENDING LOOP basically.

Just unacceptable.

Why is this happening? I can supply you guys with all my hooks or my whole config if you like.

NEVER had this issue before Claude Code 2.0.

semikolon avatar Oct 07 '25 13:10 semikolon

For immediate relief, users can add this to their shell config to detect and warn about the issue:

How/when would I use claude-check exactly? Could one surgically modify the transcript file to fix the issue, perhaps? @emcd

semikolon avatar Oct 07 '25 13:10 semikolon

For immediate relief, users can add this to their shell config to detect and warn about the issue:

How/when would I use claude-check exactly? Could one surgically modify the transcript file to fix the issue, perhaps? @emcd

@semikolon : I am not the one who posted the one-liner. It was @jamestexas . That said, my reading is that it simply detects certain manifestations of the issue and recommends a restart. In my past experience, restarting does not fix the issue; one needs to actually edit the transcript. (But, it is possible that Claude 2.0.x has some sort of rectification on restart now. I have not experienced the issue since Claude 2.0.0, so I cannot confirm whether this is true. And, given the new usage limits, my probability of encountering the problem is about 1/6th of what it used to be....)

If you're asking whether I have suggestions about automatically editing the transcript, I could probably put something together. But, I would be concerned about Anthropic changing the format.... Anthropic really needs to address this from their end.

emcd avatar Oct 07 '25 17:10 emcd

For immediate relief, users can add this to their shell config to detect and warn about the issue:

How/when would I use claude-check exactly? Could one surgically modify the transcript file to fix the issue, perhaps? @emcd

@semikolon : I am not the one who posted the one-liner. It was @jamestexas . That said, my reading is that it simply detects certain manifestations of the issue and recommends a restart. In my past experience, restarting does not fix the issue; one needs to actually edit the transcript. (But, it is possible that Claude 2.0.x has some sort of rectification on restart now. I have not experienced the issue since Claude 2.0.0, so I cannot confirm whether this is true. And, given the new usage limits, my probability of encountering the problem is about 1/6th of what it used to be....)

If you're asking whether I have suggestions about automatically editing the transcript, I could probably put something together. But, I would be concerned about Anthropic changing the format.... Anthropic really needs to address this from their end.

Thanks. That's weird, I've basically only experienced this issue since CC 2.0 / Sonnet 4.5 came out.

But I think it's due to concurrent tool use. I was about to let CC analyze my JSONL transcript files to identify patterns in what causes these errors. Started a subagent on reviewing transcripts and then hit my weekly limit seconds after 😝

semikolon avatar Oct 08 '25 09:10 semikolon

Thanks. That's weird, I've basically only experienced this issue since CC 2.0 / Sonnet 4.5 came out.

But I think it's due to concurrent tool use. I was about to let CC analyze my JSONL transcript files to identify patterns in what causes these errors. Started a subagent on reviewing transcripts and then hit my weekly limit seconds after 😝

😝 Ridiculous, isn't it? I really cannot get any useful work done with CC anymore. I've already started shifting to Codex CLI and have been evaluating Opencode with the latest DeepSeek model (and might give Grok Code Fast 1 another try too, even though my initial impressions were not good last month). At this point, metered API usage with the cheap models might be better than the scraps that are being handed out on the subscriptions. (Had a comfortable working relationship with Claude though. The other models are generally not as good at instruction following and lack the character that Claude has.) Might also go back to using the Mimeogram tool that I made earlier this year before the agentic coders let people tie to their subscriptions; the usage limits via the Claude.ai GUI do not seem as strict, especially if caching of project files is involved.

Anyway, I'm going to leave this bug report open since it is clear that you and others are still seeing problems with mismatched tool use blocks. The major trigger for the problem prior to Claude 2.0.0 was hooks running after parallel tool uses. Not sure what is happening now.

emcd avatar Oct 08 '25 22:10 emcd

Thanks. That's weird, I've basically only experienced this issue since CC 2.0 / Sonnet 4.5 came out.

But I think it's due to concurrent tool use. I was about to let CC analyze my JSONL transcript files to identify patterns in what causes these errors. Started a subagent on reviewing transcripts and then hit my weekly limit seconds after 😝

😝 Ridiculous, isn't it? I really cannot get any useful work done with CC anymore. I've already started shifting to Codex CLI and have been evaluating Opencode with the latest DeepSeek model (and might give Grok Code Fast 1 another try too, even though my initial impressions were not good last month). At this point, metered API usage with the cheap models might be better than the scraps that are being handed out on the subscriptions. (Had a comfortable working relationship with Claude though. The other models are generally not as good at instruction following and lack the character that Claude has.) Might also go back to using the Mimeogram tool that I made earlier this year before the agentic coders let people tie to their subscriptions; the usage limits via the Claude.ai GUI do not seem as strict, especially if caching of project files is involved.

Anyway, I'm going to leave this bug report open since it is clear that you and others are still seeing problems with mismatched tool use blocks. The major trigger for the problem prior to Claude 2.0.0 was hooks running after parallel tool uses. Not sure what is happening now.

Should work ok with latest release. Haven't tried but otherwise here's a workaround which has worked for me:

Safety protocol for parallel tool call bug: Add this to your CLAUDE.md to get rid of this issue until it's fixed by Anthropic (comment)

semikolon avatar Oct 20 '25 08:10 semikolon

This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.

github-actions[bot] avatar Dec 11 '25 10:12 github-actions[bot]

This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.

@semikolon (or anyone else): Do we need to keep this issue open? I personally have not been experiencing the problem for quite some time, but some of that may be due to drastically reduced Claude usage since the end of September. I think you have been more on top of it than I have.

emcd avatar Dec 11 '25 17:12 emcd

I haven't had this issue for a long time either.

almirsarajcic avatar Dec 12 '25 07:12 almirsarajcic

I just experienced it

marcosalins avatar Dec 14 '25 15:12 marcosalins

I just experienced it

Did you send in a full bug report?

semikolon avatar Dec 14 '25 17:12 semikolon

Same as @marcosalins

API Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"mes
    sages.160.content.3: unexpected `tool_use_id` found in `tool_result` blocks: 
    toolu_01Tyaf42kExvoow7R56b5jkK. Each `tool_result` block must have a corresponding 
    `tool_use` block in the previous 
    message."},"request_id":"req_011CW9fhj2YuEuwxUhUw5b6j"}

rsmaximiliano avatar Dec 16 '25 03:12 rsmaximiliano

Faced it today.

API Error: 400 {"type":"error","error":{"type":"invalid_request_error",
      "message":"messages.0.content.0: unexpected
      `tool_use_id` found in `tool_result` blocks: toolu_01TB1ZFhtFTh5M8kuKDQXydt. 
      Each `tool_result` block must have a corresponding `tool_use` block in the previous message."},
      "request_id":"req_011CWCJiiepr2w4ibaSwuPJK"}

theamrendrasingh avatar Dec 17 '25 13:12 theamrendrasingh