fix(llm): sanitize control characters in function call JSON arguments

Open • ArpitKotecha opened this issue 1 month ago • 1 comment

Description

Problem

LLMs sometimes generate function call JSON with literal control characters (e.g., newlines, tabs) inside string values. For example:

{"prompt": "A timeline showing:
- Event 1
- Event 2"}

JSON requires control characters (U+0000 through U+001F) inside strings to be escaped, so the literal newlines make the payload invalid and pydantic_core.from_json() fails with:

ValueError: control character (\u0000-\u001F) found while parsing a string

This breaks function tool execution when the LLM outputs multi-line content in tool arguments.
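The failure can be reproduced without any third-party dependency. The sketch below uses the standard library's json module as a stand-in for pydantic_core.from_json() (an assumption made to keep the snippet self-contained; the error message differs, but the cause is the same). The helper name try_parse is hypothetical.

```python
import json

# Tool-call arguments as an LLM might emit them: the newlines inside the
# string value are literal characters, not the two-character escape \n.
raw_arguments = '{"prompt": "A timeline showing:\n- Event 1\n- Event 2"}'

# The same payload with the newlines properly escaped for JSON.
escaped_arguments = '{"prompt": "A timeline showing:\\n- Event 1\\n- Event 2"}'


def try_parse(raw: str):
    """Return the parsed object, or the decode error if parsing fails."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:  # subclass of ValueError
        return exc


bad = try_parse(raw_arguments)
good = try_parse(escaped_arguments)

print(type(bad).__name__)  # the strict parser rejects the literal newline
print(good["prompt"])      # the escaped form parses to a multi-line string
```

Both payloads are meant to carry the same text. Only the escaped form is valid JSON.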

Solution

Add a _sanitize_json_control_chars() helper that escapes control characters within JSON string values before parsing:

  • \n → \\n
  • \r → \\r
  • \t → \\t
  • Other control chars → \\uXXXX

The function preserves already-escaped sequences and only modifies content inside JSON string values.
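The behaviour described above could be sketched as a single pass over the raw text that tracks whether the scanner is inside a string and whether the previous character was a backslash. This is an illustrative sketch under those assumptions, not the code from the change itself. The function name sanitize_json_control_chars and its internals are hypothetical, and json.loads stands in for pydantic_core.from_json().

```python
import json

# Short escapes for the common cases; everything else becomes \uXXXX.
_SHORT_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def sanitize_json_control_chars(raw: str) -> str:
    """Escape literal control characters that appear inside JSON strings."""
    out = []
    in_string = False  # currently between an opening and closing quote
    escaped = False    # previous character was a backslash inside a string
    for ch in raw:
        if not in_string:
            # Outside strings, whitespace such as newlines is legal JSON.
            if ch == '"':
                in_string = True
            out.append(ch)
        elif escaped:
            # Character after a backslash: part of an existing escape.
            out.append(ch)
            escaped = False
        elif ch == "\\":
            out.append(ch)
            escaped = True
        elif ch == '"':
            out.append(ch)
            in_string = False
        elif ord(ch) < 0x20:
            # Literal control character inside a string: escape it.
            out.append(_SHORT_ESCAPES.get(ch) or f"\\u{ord(ch):04x}")
        else:
            out.append(ch)
    return "".join(out)


raw = '{"prompt": "A timeline showing:\n- Event 1\n- Event 2"}'
parsed = json.loads(sanitize_json_control_chars(raw))
print(parsed["prompt"])
```

Tracking the backslash state is what keeps already-escaped sequences such as \\n or \" untouched, and limiting the rewrite to string context leaves pretty-printed JSON (newlines between tokens) unchanged.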

Changes

  • Added _sanitize_json_control_chars() helper function in utils.py
  • Modified prepare_function_arguments() to sanitize JSON before calling from_json()

Testing

Tested with real-world LLM output containing multi-line prompts that previously caused the error.

ArpitKotecha • Dec 08 '25 12:12