Evals silently fail in the adk-web frontend if the tool trajectory check fails

Open: vadakattu opened this issue 2 months ago • 1 comment

Describe the bug Loading evaluation results for a particular eval case fails if the trajectory check does not pass.

To Reproduce

Example agent

from google.adk.agents import Agent


def commerce_system(request: str) -> list[dict[str, str | int]]:
    """Responds to requests for 'promotions' data"""
    if request.lower() == "promotions":
        return [{"name": "Free delivery", "min_price": 50}]
    else:
        return []


prompt = """You are a shopping assistant that provides advice about **promotions**"""

root_agent = Agent(
    name="root_agent",
    model="gemini-2.5-flash",
    instruction=prompt,
    tools=[commerce_system],
)

Example evalset

{
    "eval_set_id": "bug",
    "eval_cases": [
        {
            "eval_id": "fails",
            "conversation": [
                {
                    "user_content": {
                        "parts": [
                            {
                                "text": "what promotions are there?"
                            }
                        ],
                        "role": "user"
                    },
                    "final_response": {
                        "parts": [
                            {
                                "text": "There is a promotion for free delivery on orders over $50."
                            }
                        ],
                        "role": "model"
                    },
                    "intermediate_data": {
                        "invocation_events": [
                            {
                                "author": "root_agent",
                                "content": {
                                    "parts": [
                                        {
                                            "function_call": {
                                                "args": {
                                                    "request": "PROMOTIONS"
                                                },
                                                "name": "commerce_system"
                                            }
                                        }
                                    ],
                                    "role": "model"
                                }
                            },
                            {
                                "author": "root_agent",
                                "content": {
                                    "parts": [
                                        {
                                            "function_response": {
                                                "name": "commerce_system",
                                                "response": {
                                                    "result": [
                                                        {
                                                            "name": "Free delivery",
                                                            "min_price": 50
                                                        }
                                                    ]
                                                }
                                            }
                                        }
                                    ],
                                    "role": "user"
                                }
                            }
                        ]
                    },
                    "creation_timestamp": 1760829983.020511
                }
            ],
            "session_input": {
                "app_name": "app",
                "user_id": "user"
            },
            "creation_timestamp": 1760830006.5009038
        },
        {
            "eval_id": "works",
            "conversation": [
                {
                    "user_content": {
                        "parts": [
                            {
                                "text": "What promotions are there?"
                            }
                        ],
                        "role": "user"
                    },
                    "final_response": {
                        "parts": [
                            {
                                "text": "There is a promotion for free delivery on orders over $50."
                            }
                        ],
                        "role": "model"
                    },
                    "intermediate_data": {
                        "invocation_events": [
                            {
                                "author": "root_agent",
                                "content": {
                                    "parts": [
                                        {
                                            "function_call": {
                                                "args": {
                                                    "request": "promotions"
                                                },
                                                "name": "commerce_system"
                                            }
                                        }
                                    ],
                                    "role": "model"
                                }
                            },
                            {
                                "author": "root_agent",
                                "content": {
                                    "parts": [
                                        {
                                            "function_response": {
                                                "name": "commerce_system",
                                                "response": {
                                                    "result": [
                                                        {
                                                            "name": "Free delivery",
                                                            "min_price": 50
                                                        }
                                                    ]
                                                }
                                            }
                                        }
                                    ],
                                    "role": "user"
                                }
                            }
                        ]
                    },
                    "creation_timestamp": 1760829983.020511
                }
            ],
            "session_input": {
                "app_name": "app",
                "user_id": "user"
            },
            "creation_timestamp": 1760830006.5009038
        }
    ],
    "creation_timestamp": 1760829999.966551
}

Expected behavior After executing the evaluations, the session should be replayed with ❌ ✅ markers indicating whether each step failed or passed, alongside a comparison of the reference and predicted outputs.

Actual behaviour The eval case whose function args match exactly is selectable. Eval cases whose tool args do not match the expected values remain clickable, but clicking them does nothing.

If the final response does not match but the trajectory check passes, the eval case loads; however, the tool calls are also marked with ❌ ❌ indicators, despite being valid calls.
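To make the difference between the two eval cases concrete, here is a minimal sketch of an exact-match trajectory comparison. It is not ADK's actual evaluator: `tool_calls_match` is a hypothetical helper, and the `actual` call is an assumption about what the model emits (lowercase `"promotions"`, which is what the "works" case expects). The only difference between the two cases is the casing of the expected `request` arg.

```python
def tool_calls_match(expected: list[dict], actual: list[dict]) -> bool:
    """Hypothetical exact-match check: same calls, same order, same args."""
    if len(expected) != len(actual):
        return False
    return all(
        e["name"] == a["name"] and e.get("args", {}) == a.get("args", {})
        for e, a in zip(expected, actual)
    )


# Expected calls taken from the two eval cases in the evalset above.
expected_fails = [{"name": "commerce_system", "args": {"request": "PROMOTIONS"}}]
expected_works = [{"name": "commerce_system", "args": {"request": "promotions"}}]

# Assumption: the model calls the tool with a lowercase argument.
actual = [{"name": "commerce_system", "args": {"request": "promotions"}}]

works_passes = tool_calls_match(expected_works, actual)
fails_passes = tool_calls_match(expected_fails, actual)
print(f"works: {works_passes}, fails: {fails_passes}")
```

Note that `commerce_system` itself lowercases its input, so both argument spellings return the same data. The trajectory mismatch is a legitimate eval failure; the bug is that the frontend cannot display it.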

Stack trace in the browser:

main-FFZEKOTG.js:18 ERROR TypeError: e is not iterable
    at t.formatToolUses (main-FFZEKOTG.js:3877:57149)
    at t.addEvalFieldsToBotEvent (main-FFZEKOTG.js:3877:57770)
    at t.addEvalCaseResultToEvents (main-FFZEKOTG.js:3877:57563)
    at Object.next (main-FFZEKOTG.js:3877:58443)
    at Yv.next (main-FFZEKOTG.js:14:3115)
    at d0._next (main-FFZEKOTG.js:14:2796)
    at d0.next (main-FFZEKOTG.js:14:2523)
    at main-FFZEKOTG.js:14:17885
    at zv._next (main-FFZEKOTG.js:14:5438)
    at zv.next (main-FFZEKOTG.js:14:2523)

Desktop (please complete the following information):

  • ADK version (pip show google-adk): 1.17 (also affects the past few versions)

Model Information:

  • gemini-2.5-flash

Additional Information

This seems to happen because intermediateData does not populate toolUses; the tool calls are only represented in intermediate_responses.
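One way to check this against an evalset file is sketched below. It assumes the JSON layout shown in the example evalset above and assumes that the snake_case key `tool_uses` is the serialized counterpart of `toolUses`; `summarize_tool_data` is a hypothetical helper, not part of ADK. It reports, per eval case, whether `tool_uses` is present and which function calls can be found in `invocation_events` instead.

```python
def summarize_tool_data(evalset: dict) -> dict[str, dict]:
    """Per eval case: is `tool_uses` populated, and which calls are in `invocation_events`?"""
    summary = {}
    for case in evalset.get("eval_cases", []):
        has_tool_uses = False
        calls = []
        for invocation in case.get("conversation", []):
            data = invocation.get("intermediate_data") or {}
            # Assumed key name; absent in the evalset from this issue.
            if data.get("tool_uses"):
                has_tool_uses = True
            for event in data.get("invocation_events", []):
                parts = (event.get("content") or {}).get("parts", [])
                for part in parts:
                    call = part.get("function_call")
                    if call:
                        calls.append((call["name"], call.get("args", {})))
        summary[case["eval_id"]] = {"has_tool_uses": has_tool_uses, "calls": calls}
    return summary


# Trimmed version of the "fails" case from the evalset above.
sample = {
    "eval_cases": [
        {
            "eval_id": "fails",
            "conversation": [
                {
                    "intermediate_data": {
                        "invocation_events": [
                            {
                                "content": {
                                    "parts": [
                                        {
                                            "function_call": {
                                                "name": "commerce_system",
                                                "args": {"request": "PROMOTIONS"},
                                            }
                                        }
                                    ]
                                }
                            }
                        ]
                    }
                }
            ],
        }
    ]
}

result = summarize_tool_data(sample)
print(result)
```

For a real file, load it with `json.load` and pass the resulting dict to the same function. If `has_tool_uses` is false for every case while `calls` is non-empty, that would be consistent with the frontend iterating over a missing `toolUses` value in `formatToolUses`.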

vadakattu avatar Oct 23 '25 16:10 vadakattu

I'm also encountering the same issue. Could you please let me know if there's an expected timeline for fixing this bug? @wyf7107

main-FFZEKOTG.js:18 ERROR TypeError: e is not iterable
    at t.formatToolUses (main-FFZEKOTG.js:3877:57149)
    at t.addEvalFieldsToBotEvent (main-FFZEKOTG.js:3877:57770)
    at t.addEvalCaseResultToEvents (main-FFZEKOTG.js:3877:57563)
    at t.getHistorySession (main-FFZEKOTG.js:3877:59844)
    at main-FFZEKOTG.js:3877:51607
    at eY (main-FFZEKOTG.js:18:105622)
    at i (main-FFZEKOTG.js:18:105468)
    at HTMLButtonElement.<anonymous> (main-FFZEKOTG.js:22:59150)
    at f.invokeTask (polyfills-B6TNHZQ6.js:16:7505)
    at Object.onInvokeTask (main-FFZEKOTG.js:18:24875)

diseng avatar Nov 02 '25 15:11 diseng