
[Bug]: Bedrock thinking model with tools fail with Expected `thinking` or `redacted_thinking`, but found `tool_use`.

Open tinuva opened this issue 3 months ago • 24 comments

What happened?

When I create an agent in LibreChat with tools enabled on a Bedrock model with thinking enabled, I receive the following error shortly after asking a question:

Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block

The combination that usually triggers this is:

  • Model with thinking
  • Tools
  • Bedrock provider using LiteLLM proxy

Here is a sample conversation that usually triggers this. The example uses Claude 3.7 Sonnet with thinking enabled.
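Stripped down, the message sequence that triggers the 400 looks like the sketch below (tool name and IDs are illustrative, taken loosely from the request log further down). When extended thinking is enabled, Anthropic requires the final assistant turn to start with a `thinking` or `redacted_thinking` block, but the replayed assistant turn here begins directly with a tool call:

```python
# Minimal sketch of the conversation shape Bedrock rejects.
# The assistant turn from the previous round is replayed WITHOUT the
# thinking block the model originally produced, so its first content
# block is `tool_use` instead of `thinking`.
failing_messages = [
    {"role": "user", "content": "which nanovna to buy"},
    {
        "role": "assistant",
        "content": "",  # thinking block from the previous turn was dropped
        "tool_calls": [
            {
                "id": "tooluse_example",  # illustrative ID
                "type": "function",
                "function": {
                    "name": "google",
                    "arguments": '{"query": "nanovna"}',
                },
            }
        ],
    },
    {
        "role": "tool",
        "name": "google",
        "tool_call_id": "tooluse_example",
        "content": "...search results...",
    },
]

# Bedrock validates messages[1].content[0].type and returns the
# "Expected `thinking` or `redacted_thinking`, but found `tool_use`" error.
```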


Relevant log output

API ERROR:

400 litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}. Received Model Group=bedrock/claude-3-7-sonnet-thinking Available Model Group Fallbacks=None


Request log (from webui):

{
  "user": "67c0830e48d996cbba519e38",
  "model": "bedrock/claude-3-7-sonnet-thinking",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "google",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "query"
          ],
          "properties": {
            "query": {
              "type": "string",
              "minLength": 1,
              "description": "The search query string."
            },
            "max_results": {
              "type": "number",
              "maximum": 10,
              "minimum": 1,
              "description": "The maximum number of search results to return. Defaults to 10."
            }
          },
          "additionalProperties": false
        },
        "description": "A search engine optimized for comprehensive, accurate, and trusted results. Useful for when you need to answer questions about current events."
      }
    },
    {
      "type": "function",
      "function": {
        "name": "search_pages_mcp_wikijs",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "query"
          ],
          "properties": {
            "path": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "Optional path to limit search scope"
                    }
                  ],
                  "description": "Optional path to limit search scope"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Optional path to limit search scope"
            },
            "query": {
              "type": "string",
              "description": "Search query to find pages"
            },
            "locale": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "Optional locale for the search (e.g., \"en\")"
                    }
                  ],
                  "description": "Optional locale for the search (e.g., \"en\")"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Optional locale for the search (e.g., \"en\")"
            }
          },
          "additionalProperties": false
        },
        "description": "Search for pages in WikiJS by query string"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_page_by_id_mcp_wikijs",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "id"
          ],
          "properties": {
            "id": {
              "type": "number",
              "description": "The ID of the page to retrieve"
            }
          },
          "additionalProperties": false
        },
        "description": "Get a WikiJS page by its ID"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_page_by_path_mcp_wikijs",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "path",
            "locale"
          ],
          "properties": {
            "path": {
              "type": "string",
              "description": "The path of the page to retrieve"
            },
            "locale": {
              "type": "string",
              "description": "The locale of the page (e.g., \"en\")"
            }
          },
          "additionalProperties": false
        },
        "description": "Get a WikiJS page by its path and locale"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_all_pages_mcp_wikijs",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "properties": {},
          "additionalProperties": false
        },
        "description": "Get all pages in WikiJS"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "reddit_mcp-fetch_reddit_hot_threads_mcp_LiteLLM",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "subreddit"
          ],
          "properties": {
            "limit": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {}
                  ]
                },
                {
                  "type": "null"
                }
              ]
            },
            "subreddit": {
              "type": "string"
            }
          },
          "additionalProperties": false
        },
        "description": "Fetch hot threads from a subreddit\n\nArgs:\n    subreddit: Name of the subreddit\n    limit: Number of posts to fetch (default: 10)\n\nReturns:\n    Human readable string containing list of post information"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "reddit_mcp-fetch_reddit_post_content_mcp_LiteLLM",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "post_id"
          ],
          "properties": {
            "post_id": {
              "type": "string"
            },
            "comment_depth": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {}
                  ]
                },
                {
                  "type": "null"
                }
              ]
            },
            "comment_limit": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {}
                  ]
                },
                {
                  "type": "null"
                }
              ]
            }
          },
          "additionalProperties": false
        },
        "description": "Fetch detailed content of a specific post\n\nArgs:\n    post_id: Reddit post ID\n    comment_limit: Number of top level comments to fetch\n    comment_depth: Maximum depth of comment tree to traverse\n\nReturns:\n    Human readable string containing post content and comments tree"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "videos_getVideo_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "videoId"
          ],
          "properties": {
            "parts": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "array",
                      "items": {
                        "type": "string"
                      },
                      "description": "Parts of the video to retrieve"
                    }
                  ],
                  "description": "Parts of the video to retrieve"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Parts of the video to retrieve"
            },
            "videoId": {
              "type": "string",
              "description": "The YouTube video ID"
            }
          },
          "additionalProperties": false
        },
        "description": "Get detailed information about a YouTube video"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "videos_searchVideos_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "query"
          ],
          "properties": {
            "query": {
              "type": "string",
              "description": "Search query"
            },
            "maxResults": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "number",
                      "description": "Maximum number of results to return"
                    }
                  ],
                  "description": "Maximum number of results to return"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Maximum number of results to return"
            }
          },
          "additionalProperties": false
        },
        "description": "Search for videos on YouTube"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "transcripts_getTranscript_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "videoId"
          ],
          "properties": {
            "videoId": {
              "type": "string",
              "description": "The YouTube video ID"
            },
            "language": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "Language code for the transcript"
                    }
                  ],
                  "description": "Language code for the transcript"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Language code for the transcript"
            }
          },
          "additionalProperties": false
        },
        "description": "Get the transcript of a YouTube video"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "channels_getChannel_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "channelId"
          ],
          "properties": {
            "channelId": {
              "type": "string",
              "description": "The YouTube channel ID"
            }
          },
          "additionalProperties": false
        },
        "description": "Get information about a YouTube channel"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "channels_listVideos_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "channelId"
          ],
          "properties": {
            "channelId": {
              "type": "string",
              "description": "The YouTube channel ID"
            },
            "maxResults": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "number",
                      "description": "Maximum number of results to return"
                    }
                  ],
                  "description": "Maximum number of results to return"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Maximum number of results to return"
            }
          },
          "additionalProperties": false
        },
        "description": "Get videos from a specific channel"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "playlists_getPlaylist_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "playlistId"
          ],
          "properties": {
            "playlistId": {
              "type": "string",
              "description": "The YouTube playlist ID"
            }
          },
          "additionalProperties": false
        },
        "description": "Get information about a YouTube playlist"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "playlists_getPlaylistItems_mcp_youtubemcp",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "playlistId"
          ],
          "properties": {
            "maxResults": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "number",
                      "description": "Maximum number of results to return"
                    }
                  ],
                  "description": "Maximum number of results to return"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Maximum number of results to return"
            },
            "playlistId": {
              "type": "string",
              "description": "The YouTube playlist ID"
            }
          },
          "additionalProperties": false
        },
        "description": "Get videos in a YouTube playlist"
      }
    },
    {
      "type": "function",
      "function": {
        "name": "read_url_mcp_mcp_jinaai_reader",
        "parameters": {
          "type": "object",
          "$schema": "http://json-schema.org/draft-07/schema#",
          "required": [
            "url"
          ],
          "properties": {
            "url": {
              "type": "string",
              "description": "URL to process"
            },
            "format": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "enum": [
                        "json",
                        "stream"
                      ],
                      "type": "string",
                      "description": "Response format (json or stream)"
                    }
                  ],
                  "description": "Response format (json or stream)"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Response format (json or stream)"
            },
            "timeout": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "number",
                      "description": "Maximum time in seconds to wait for webpage load"
                    }
                  ],
                  "description": "Maximum time in seconds to wait for webpage load"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Maximum time in seconds to wait for webpage load"
            },
            "no_cache": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "boolean",
                      "description": "Bypass cache for fresh results"
                    }
                  ],
                  "description": "Bypass cache for fresh results"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Bypass cache for fresh results"
            },
            "with_iframe": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "boolean",
                      "description": "Include iframe content in response"
                    }
                  ],
                  "description": "Include iframe content in response"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Include iframe content in response"
            },
            "remove_selector": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "CSS selector to exclude specific elements (leave empty to not use this feature)"
                    }
                  ],
                  "description": "CSS selector to exclude specific elements (leave empty to not use this feature)"
                },
                {
                  "type": "null"
                }
              ],
              "description": "CSS selector to exclude specific elements (leave empty to not use this feature)"
            },
            "target_selector": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "CSS selector to focus on specific elements (leave empty to not use this feature)"
                    }
                  ],
                  "description": "CSS selector to focus on specific elements (leave empty to not use this feature)"
                },
                {
                  "type": "null"
                }
              ],
              "description": "CSS selector to focus on specific elements (leave empty to not use this feature)"
            },
            "wait_for_selector": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "string",
                      "description": "CSS selector to wait for specific elements (leave empty to not use this feature)"
                    }
                  ],
                  "description": "CSS selector to wait for specific elements (leave empty to not use this feature)"
                },
                {
                  "type": "null"
                }
              ],
              "description": "CSS selector to wait for specific elements (leave empty to not use this feature)"
            },
            "with_generated_alt": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "boolean",
                      "description": "Add alt text to images lacking captions"
                    }
                  ],
                  "description": "Add alt text to images lacking captions"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Add alt text to images lacking captions"
            },
            "with_links_summary": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "boolean",
                      "description": "Gather all links at the end of response"
                    }
                  ],
                  "description": "Gather all links at the end of response"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Gather all links at the end of response"
            },
            "with_images_summary": {
              "anyOf": [
                {
                  "anyOf": [
                    {
                      "not": {}
                    },
                    {
                      "type": "boolean",
                      "description": "Gather all images at the end of response"
                    }
                  ],
                  "description": "Gather all images at the end of response"
                },
                {
                  "type": "null"
                }
              ],
              "description": "Gather all images at the end of response"
            }
          },
          "additionalProperties": false
        },
        "description": "Convert any URL to LLM-friendly text using Jina.ai Reader"
      }
    }
  ],
  "stream": true,
  "messages": [
    {
      "role": "system",
      "content": "Today's date is 2025-09-03 (3). Please be aware of this date when discussing news articles, events, or any time-sensitive information. Any articles or events dated before this are historical, and any \"future\" events beyond this date should be treated as predictions or speculation, not facts. References to \"recent\" should be evaluated relative to this current date.\n\nWhen encountering measurements in non-metric units, always convert these values to the metric system. When providing these conversions, include the original measurement in brackets. For example:\n- 5 miles should be expressed as 8.05 kilometers (5 miles)\n- 160 pounds should be expressed as 72.57 kilograms (160 pounds)\n- 72°F should be expressed as 22.22°C (72°F)\n- 3 gallons should be expressed as 11.36 liters (3 gallons)\n- 2 inches should be expressed as 5.08 centimeters (2 inches)"
    },
    {
      "role": "user",
      "content": "which nanovna to buy"
    },
    {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "id": "tooluse_Z-Uip9juRx-JI7VYY6jiFA",
          "type": "function",
          "function": {
            "name": "google",
            "arguments": "{\"query\":\"best nanovna to buy 2025 comparison models\",\"max_results\":7}"
          }
        }
      ]
    },
    {
      "name": "google",
      "role": "tool",
      "content": "{\"kind\":\"customsearch#search\",\"url\":{\"type\":\"application/json\",\"template\":\"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"},\"queries\":{\"request\":[{\"title\":\"Google Custom Search - best nanovna to buy 2025 comparison models\",\"totalResults\":\"17300\",\"searchTerms\":\"best nanovna to buy 2025 comparison models\",\"count\":7,\"startIndex\":1,\"inputEncoding\":\"utf8\",\"outputEncoding\":\"utf8\",\"safe\":\"off\",... (litellm_truncated 10317 chars)",
      "tool_call_id": "tooluse_Z-Uip9juRx-JI7VYY6jiFA"
    }
  ]
}


From the Docker logs:

litellm                | 04:29:15 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of key in this minute: None
litellm                | 04:29:15 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of user in this minute: None
litellm                | 04:29:15 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of customer in this minute: None
litellm                | 04:29:17 - LiteLLM Router:INFO: router.py:1331 - litellm.acompletion(model=bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0) 200 OK
litellm                | INFO:     172.18.0.14:43892 - "POST /chat/completions HTTP/1.1" 200 OK
litellm                | 04:29:21 - LiteLLM Proxy:INFO: db_spend_update_writer.py:334 - Writing spend log to db - request_id: chatcmpl-53abd95c-ba3e-4590-8f61-79a101de4990, spend: 0.013137000000000001
litellm                | 04:29:21 - LiteLLM Proxy:INFO: db_spend_update_writer.py:1090 - Logged request status: success
litellm                | 04:29:21 - LiteLLM Proxy:INFO: db_spend_update_writer.py:1090 - Logged request status: success
litellm                | 04:29:21 - LiteLLM Proxy:INFO: db_spend_update_writer.py:1090 - Logged request status: success
litellm                | 04:29:22 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of key in this minute: {'current_requests': 0, 'current_tpm': 3531, 'current_rpm': 1}
litellm                | 04:29:22 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of user in this minute: {'current_requests': 0, 'current_tpm': 3531, 'current_rpm': 1}
litellm                | 04:29:22 - LiteLLM Proxy:INFO: parallel_request_limiter.py:68 - Current Usage of customer in this minute: {'current_requests': 0, 'current_tpm': 3531, 'current_rpm': 1}
litellm                | 04:29:23 - LiteLLM Router:INFO: router.py:1359 - litellm.acompletion(model=bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0) Exception litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}
litellm                | 04:29:23 - LiteLLM Router:INFO: router.py:4041 - Retrying request with num_retries: 2
litellm                | 04:29:25 - LiteLLM Router:INFO: router.py:1359 - litellm.acompletion(model=bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0) Exception litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}
litellm                | 04:29:27 - LiteLLM Router:INFO: router.py:1359 - litellm.acompletion(model=bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0) Exception litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}
litellm                | 04:29:27 - LiteLLM Proxy:INFO: db_spend_update_writer.py:962 - Processed 1 daily user transactions in 0.01s
litellm                | 04:29:27 - LiteLLM Proxy:INFO: db_spend_update_writer.py:962 - Processed 1 daily team transactions in 0.01s
litellm                | 04:29:27 - LiteLLM Proxy:INFO: db_spend_update_writer.py:962 - Processed 2 daily tag transactions in 0.01s
litellm                | 04:29:28 - LiteLLM Router:INFO: router.py:3706 - Trying to fallback b/w models
litellm                | 04:29:28 - LiteLLM Proxy:ERROR: common_request_processing.py:642 - litellm.proxy.proxy_server._handle_llm_api_exception(): Exception occured - litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}. Received Model Group=bedrock/claude-3-7-sonnet-thinking
litellm                | Available Model Group Fallbacks=None LiteLLM Retried: 1 times, LiteLLM Max Retries: 2
litellm                | Traceback (most recent call last):
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 200, in make_call
litellm                |     response = await client.post(
litellm                |                ^^^^^^^^^^^^^^^^^^
litellm                |     ...<5 lines>...
litellm                |     )
litellm                |     ^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/logging_utils.py", line 135, in async_wrapper
litellm                |     result = await func(*args, **kwargs)
litellm                |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 324, in post
litellm                |     raise e
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/llms/custom_httpx/http_handler.py", line 280, in post
litellm                |     response.raise_for_status()
litellm                |     ~~~~~~~~~~~~~~~~~~~~~~~~~^^
litellm                |   File "/usr/lib/python3.13/site-packages/httpx/_models.py", line 829, in raise_for_status
litellm                |     raise HTTPStatusError(message, request=request, response=self)
litellm                | httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://bedrock-runtime.us-east-2.amazonaws.com/model/us.anthropic.claude-3-7-sonnet-20250219-v1%3A0/converse-stream'
litellm                | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400
litellm                |
litellm                | During handling of the above exception, another exception occurred:
litellm                |
litellm                | Traceback (most recent call last):
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/main.py", line 544, in acompletion
litellm                |     response = await init_response
litellm                |                ^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/llms/bedrock/chat/converse_handler.py", line 147, in async_streaming
litellm                |     completion_stream = await make_call(
litellm                |                         ^^^^^^^^^^^^^^^^
litellm                |     ...<9 lines>...
litellm                |     )
litellm                |     ^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 263, in make_call
litellm                |     raise BedrockError(status_code=error_code, message=err.response.text)
litellm                | litellm.llms.bedrock.common_utils.BedrockError: {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}
litellm                |
litellm                | During handling of the above exception, another exception occurred:
litellm                |
litellm                | Traceback (most recent call last):
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 4055, in chat_completion
litellm                |     result = await base_llm_response_processor.base_process_llm_request(
litellm                |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |     ...<16 lines>...
litellm                |     )
litellm                |     ^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/proxy/common_request_processing.py", line 436, in base_process_llm_request
litellm                |     responses = await llm_responses
litellm                |                 ^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1083, in acompletion
litellm                |     raise e
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1059, in acompletion
litellm                |     response = await self.async_function_with_fallbacks(**kwargs)
litellm                |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3902, in async_function_with_fallbacks
litellm                |     return await self.async_function_with_fallbacks_common_utils(
litellm                |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |     ...<8 lines>...
litellm                |     )
litellm                |     ^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3860, in async_function_with_fallbacks_common_utils
litellm                |     raise original_exception
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3894, in async_function_with_fallbacks
litellm                |     response = await self.async_function_with_retries(*args, **kwargs)
litellm                |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 4099, in async_function_with_retries
litellm                |     raise original_exception
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3990, in async_function_with_retries
litellm                |     response = await self.make_call(original_function, *args, **kwargs)
litellm                |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 4108, in make_call
litellm                |     response = await response
litellm                |                ^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1364, in _acompletion
litellm                |     raise e
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/router.py", line 1316, in _acompletion
litellm                |     response = await _response
litellm                |                ^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1586, in wrapper_async
litellm                |     raise e
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1437, in wrapper_async
litellm                |     result = await original_function(*args, **kwargs)
litellm                |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/main.py", line 563, in acompletion
litellm                |     raise exception_type(
litellm                |           ~~~~~~~~~~~~~~^
litellm                |         model=model,
litellm                |         ^^^^^^^^^^^^
litellm                |     ...<3 lines>...
litellm                |         extra_kwargs=kwargs,
litellm                |         ^^^^^^^^^^^^^^^^^^^^
litellm                |     )
litellm                |     ^
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2301, in exception_type
litellm                |     raise e
litellm                |   File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 993, in exception_type
litellm                |     raise BadRequestError(
litellm                |     ...<4 lines>...
litellm                |     )
litellm                | litellm.exceptions.BadRequestError: litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}. Received Model Group=bedrock/claude-3-7-sonnet-thinking
litellm                | Available Model Group Fallbacks=None LiteLLM Retried: 1 times, LiteLLM Max Retries: 2

Are you a ML Ops Team?

No

What LiteLLM version are you on ?

ghcr.io/berriai/litellm main-stable 529382e4d5a2 8 days ago 2.19GB

Twitter / LinkedIn details

No response

tinuva avatar Sep 03 '25 04:09 tinuva

I get a very similar error when using Open WebUI. For me, it occurs:

  • In an Open WebUI chat
  • ONLY with thinking enabled
  • On anthropic models
  • ONLY with native tool calling (default tool calling is a workaround)
  • After a successful tool call - so when OWUI sends the tool call result to the model
Error processing request: Error code: 400 - {'error': {'message': 'litellm.BadRequestError: AnthropicException - b\'{"type":"error","error":{"type":"invalid_request_error","message":"messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `text`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"},"request_id":"req_011CSnDc4"}\'.

Open WebUI is talking to the LiteLLM native endpoint.

I assume LiteLLM needs to change how it transforms the request before forwarding it to the Anthropic API.
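For reference, the Anthropic extended-thinking docs require that an assistant turn containing `tool_use` begin with its `thinking` (or `redacted_thinking`) block. A minimal sketch of the message shape the API expects, plus a checker (the block contents below, e.g. the signature, tool id, and input, are hypothetical placeholders, not real values):

```python
# Sketch of the assistant-turn shape the Anthropic API requires when extended
# thinking is enabled: the thinking (or redacted_thinking) block must come
# before any tool_use blocks. Placeholder values throughout.

valid_assistant_turn = {
    "role": "assistant",
    "content": [
        {"type": "thinking", "thinking": "I should search for this.", "signature": "sig-placeholder"},
        {"type": "tool_use", "id": "toolu_123", "name": "google", "input": {"query": "litellm"}},
    ],
}

def starts_with_thinking(message):
    """Return True if an assistant message with block content starts with a
    thinking or redacted_thinking block (other messages are unconstrained)."""
    if message.get("role") != "assistant" or not isinstance(message.get("content"), list):
        return True
    first = message["content"][0]["type"]
    return first in ("thinking", "redacted_thinking")
```

The error in this issue means the transformed request reaching Bedrock/Anthropic fails this check: the first block of the assistant turn is `tool_use` (or `text`) instead of `thinking`.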

nootreeno avatar Sep 04 '25 04:09 nootreeno

Just encountered the same issue. The error I encountered:

litellm.BadRequestError: AnthropicException - b'{"type":"error","error":{"type":"invalid_request_error","message":"messages.1.content.0.type: Expected thinking or redacted_thinking, but found tool_use. When thinking is enabled, a final assistant message must start with a thinking block (preceeding the lastmost set of tool_use and tool_result blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable thinking. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"},"request_id":"req_011CTBmpZ7F32HZ8gBesY5b9"}'

preeteshjain avatar Sep 16 '25 09:09 preeteshjain

Facing the same issue when using Claude Sonnet 4 hosted on AWS Bedrock (bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0) with the Python SDK. It prevents us from enabling thinking for the Claude models.

ikaliuzh avatar Sep 19 '25 13:09 ikaliuzh

Looks like this issue happens with Claude reasoning models in general; it is not specific to Bedrock or the Anthropic API.

preeteshjain avatar Sep 23 '25 11:09 preeteshjain

@ishaan-jaff @krrishdholakia Can you please take a look into this? It has been happening for quite some time and is preventing tool use with Claude reasoning models.

preeteshjain avatar Sep 23 '25 11:09 preeteshjain

Related issue: https://github.com/BerriAI/litellm/issues/9020

@ishaan-jaff @krrishdholakia any updates? It seems like a lot of folks are experiencing this issue.

preeteshjain avatar Sep 25 '25 09:09 preeteshjain

I am encountering the same issue when using Opus 4.1 model from Azure Databricks.

2025-09-27 17:55:43.000 | ValueError: Error calling litellm.acompletion for non-Anthropic model: litellm.BadRequestError: databricksException - {"error_code":"BAD_REQUEST","message":"{"message":"messages.1.content.0.type: Expected thinking or redacted_thinking, but found text. When thinking is enabled, a final assistant message must start with a thinking block (preceeding the lastmost set of tool_use and tool_result blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable thinking. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking"}"}

blrchen avatar Sep 27 '25 11:09 blrchen

I am also encountering the same issue when using both Opus 4.1 and Sonnet 4 models from Azure Databricks.

File "python3.13/site-packages/litellm/main.py", line 573, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2301, in exception_type
    raise e
  File "python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 823, in exception_type
    raise BadRequestError(
    ...<3 lines>...
    )
litellm.exceptions.BadRequestError: litellm.BadRequestError: databricksException - {"error_code":"BAD_REQUEST","message":"{\"message\":\"messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking\"}"}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "python3.13/site-packages/litellm/proxy/anthropic_endpoints/endpoints.py", line 153, in anthropic_response
    responses = await llm_responses
                ^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/router.py", line 3569, in async_wrapper
    return await self._ageneric_api_call_with_fallbacks(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
    )
    ^
  File "python3.13/site-packages/litellm/router.py", line 2704, in _ageneric_api_call_with_fallbacks
    raise e
  File "python3.13/site-packages/litellm/router.py", line 2691, in _ageneric_api_call_with_fallbacks
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/router.py", line 3895, in async_function_with_fallbacks
    return await self.async_function_with_fallbacks_common_utils(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
    )
    ^
  File "python3.13/site-packages/litellm/router.py", line 3853, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "python3.13/site-packages/litellm/router.py", line 3887, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/router.py", line 4092, in async_function_with_retries
    raise original_exception
  File "python3.13/site-packages/litellm/router.py", line 3983, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/router.py", line 4101, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/router.py", line 2783, in _ageneric_api_call_with_fallbacks_helper
    raise e
  File "python3.13/site-packages/litellm/router.py", line 2769, in _ageneric_api_call_with_fallbacks_helper
    response = await response  # type: ignore
               ^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/utils.py", line 1597, in wrapper_async
    raise e
  File "python3.13/site-packages/litellm/utils.py", line 1448, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/llms/anthropic/experimental_pass_through/messages/handler.py", line 90, in anthropic_messages
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "python3.13/site-packages/litellm/llms/anthropic/experimental_pass_through/adapters/handler.py", line 179, in async_anthropic_messages_handler
    raise ValueError(
        f"Error calling litellm.acompletion for non-Anthropic model: {str(e)}"
    )


ValueError: Error calling litellm.acompletion for non-Anthropic model: litellm.BadRequestError: databricksException - {"error_code":"BAD_REQUEST","message":"{\"message\":\"messages.1.content.0.type: Expected `thinking` or `redacted_thinking`, but found `tool_use`. When `thinking` is enabled, a final `assistant` message must start with a thinking block (preceeding the lastmost set of `tool_use` and `tool_result` blocks). We recommend you include thinking blocks from previous turns. To avoid this requirement, disable `thinking`. Please consult our documentation at https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking\"}"}

samtsangbiz avatar Sep 28 '25 08:09 samtsangbiz

+1 here. Please fix this if possible, thanks.

chenxizhang avatar Sep 30 '25 06:09 chenxizhang

cc: @Sameerlite can you please look into this?

krrishdholakia avatar Oct 07 '25 03:10 krrishdholakia

Sure

Sameerlite avatar Oct 07 '25 08:10 Sameerlite

> Sure

@Sameerlite have you looked into this yet? We are struggling a lot with it. I might take a look if you haven't gotten to it yet, but that will happen later.

MalteHB avatar Oct 08 '25 20:10 MalteHB

How are people actually using both OpenAI and Anthropic models simultaneously with tool calls through the LiteLLM proxy?

MalteHB avatar Oct 08 '25 20:10 MalteHB

We encountered this using the Claude Code client and Bedrock. It's weird that us-west-2 has this issue, while with Sonnet 4 in us-east-1 the issue goes away.

zilingzhang avatar Oct 08 '25 22:10 zilingzhang

~~v1.77.7-stable seems to fixed it for us. Thanks!~~

No, it still fails, litellm.BadRequestError: BedrockException - {"message":"The model returned the following errors: messages.1.content.0.type: Expected thinking or redacted_thinking, but found text. When thinking is enabled, a final assistant message must start with a thinking block. We recommend you include thinking blocks from previous turns. To avoid this requirement, disable thinking. Please consult our documentation at https://docs.claude.com/en/docs/build-with-claude/extended-thinking"}.
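
Until the proxy handles this, a possible client-side band-aid (just a sketch, assuming your client still has the raw Anthropic-format content blocks from the previous assistant turn, including the thinking blocks) is to reorder each assistant turn so the thinking block comes first before resending the history:

```python
def ensure_thinking_first(message):
    """Reorder an assistant message's content blocks so that any thinking /
    redacted_thinking blocks precede the tool_use (and other) blocks,
    matching what the Anthropic API expects when thinking is enabled.
    Non-assistant messages and string content pass through unchanged."""
    if message.get("role") != "assistant" or not isinstance(message.get("content"), list):
        return message
    thinking = [b for b in message["content"] if b["type"] in ("thinking", "redacted_thinking")]
    rest = [b for b in message["content"] if b["type"] not in ("thinking", "redacted_thinking")]
    return {**message, "content": thinking + rest}
```

Note this only helps when the thinking blocks survived the round trip at all; if the client (or the proxy's transformation) dropped them entirely, reordering cannot recreate them, and the fix has to happen in the request transformation itself.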

zilingzhang avatar Oct 12 '25 18:10 zilingzhang

Closing this week. A PR fixing the error through the v1/messages endpoint has been raised for both streaming and non-streaming.

Sameerlite avatar Oct 13 '25 14:10 Sameerlite

> Closing it this week. The PR for error through v1/messages endpoint has been raised for streaming/non-streaming

What does that mean? We are still seeing errors, and it means that for the proxy, we cannot use OpenAI and Anthropic models without screwing up the context with stringified tool calls. @ishaan-jaff @krrishdholakia are you aware of this?

MalteHB avatar Oct 20 '25 10:10 MalteHB

ack @Sameerlite can you link the pr to this ticket, so we know it's closing this issue

krrishdholakia avatar Oct 20 '25 19:10 krrishdholakia

> ack @Sameerlite can you link the pr to this ticket, so we know it's closing this issue

But it is still an issue, sorry

MalteHB avatar Oct 20 '25 19:10 MalteHB

https://github.com/BerriAI/litellm/pull/15501 https://github.com/BerriAI/litellm/pull/15220

These PRs solve the issue faced when using v1/messages.

Sameerlite avatar Oct 20 '25 19:10 Sameerlite

@Sameerlite is there any litellm version I can use that solves this issue?

irfansofyana avatar Oct 28 '25 03:10 irfansofyana

Hi @krrishdholakia @ishaan-jaff This issue is still happening

In which release can we expect a fix?

preeteshjain avatar Nov 07 '25 07:11 preeteshjain

Hi @krrishdholakia @ishaan-jaff This issue is still happening

In which release can we expect a fix?

MalteHB avatar Nov 19 '25 13:11 MalteHB

I noticed that Databricks now has native support for Claude Code. Just click the Getting Started button (see screenshot below) and follow the steps. So if you are using Claude models from Databricks with Claude Code, this proxy is no longer needed.

Image

blrchen avatar Nov 23 '25 09:11 blrchen

I don't think this issue is resolved when thinking and tool use are enabled.

vaughngit avatar Nov 25 '25 04:11 vaughngit

Same issue with Claude on Google Vertex AI when using /v1/chat/completions with thinking enabled and tool use.

mhjort avatar Nov 25 '25 08:11 mhjort