
[bug] Traces are overwhelmed by nested metadata in Agno

hudyweas opened this issue 4 months ago

Describe the bug AgnoInstrumentor currently produces spans containing extremely verbose metadata, such as nested objects and full config dumps, which makes the trace data practically unusable for debugging or observability. The noise drowns out the key insights, making tracing ineffective and inflating tracing costs.

Some numbers to visualise the problem:

  • over 20,000 characters of metadata in the user input
  • 212 total JSON lines in the output, of which 209 are metadata; the actual model response was only 3 sentences.

To Reproduce Use AgnoInstrumentor().instrument()
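For reference, a minimal reproduction sketch. The exporter setup here is illustrative, and the import path assumes the openinference-instrumentation-agno package:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
from openinference.instrumentation.agno import AgnoInstrumentor

# Route spans to stdout so the attribute bloat is easy to inspect.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

AgnoInstrumentor().instrument()

# Any Agno agent run made after this point emits spans whose attributes
# include the flattened agent config, state, and message history.
```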

Expected Behaviour Spans should surface only the essential parts, such as the user message and the output text. Everything else should be stored under a dedicated metadata attribute.
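For illustration only (these attribute names are a hypothetical sketch, not the instrumentor's actual schema), the desired span shape would look roughly like:

```python
# Hypothetical "essential-only" span: user input and model output as
# top-level attributes, everything else collapsed into one metadata blob.
desired_attributes = {
    "input.value": "<user message>",
    "output.value": "<model response text>",
    "metadata": "<single JSON string holding run/session/config details>",
}
```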

Actual Behaviour Tracing UIs display spans of 200+ lines, filled with nested agent configs, state, and irrelevant fields. Finding the actual user message becomes extremely difficult, bordering on impossible.

Screenshots Input message and output message (screenshots omitted).

Additional context We are using Langfuse for tracing.

hudyweas · Aug 06 '25 18:08

Hi @hudyweas! I'm Dosu and I'm helping the openinference team.

This is a known limitation: AgnoInstrumentor currently serializes and attaches all input arguments, agent attributes, and context metadata directly to span attributes—often deeply nested and including full config dumps—which leads to extremely verbose and unwieldy trace data. There are no built-in configuration options or filters to limit, filter, or truncate which metadata fields are included in spans; everything is included as-is except for a small set of sensitive model parameters that are redacted (source). The _flatten function is responsible for recursively flattening all nested mappings and lists into dot-separated span attribute keys, which can result in hundreds of lines of metadata if the input is complex (source).
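For illustration, a minimal sketch of this kind of recursive flattening (simplified; not the actual openinference `_flatten` implementation):

```python
def flatten(value, prefix=""):
    """Yield (dotted_key, leaf_value) pairs for nested mappings and lists."""
    if isinstance(value, dict):
        for key, sub in value.items():
            yield from flatten(sub, f"{prefix}{key}.")
    elif isinstance(value, (list, tuple)):
        for index, sub in enumerate(value):
            yield from flatten(sub, f"{prefix}{index}.")
    else:
        yield prefix.rstrip("."), value

# Even a modest nested run payload fans out into many span attributes:
run = {
    "messages": [{"role": "user", "content": "hi"}],
    "config": {"model": {"id": "gpt-4o", "params": {"temperature": 0.2}}},
}
print(dict(flatten(run)))
# {'messages.0.role': 'user', 'messages.0.content': 'hi',
#  'config.model.id': 'gpt-4o', 'config.model.params.temperature': 0.2}
```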

Best practices (and user requests in related issues) suggest that only essential information—like user message and output text—should be included as top-level span attributes, with additional metadata stored separately or in a less prominent way, but this is not currently implemented. Workarounds used in other instrumentors include manually converting outputs to dicts before they reach the instrumentor or patching the serialization logic locally, but there is no built-in solution for AgnoInstrumentor (source).

If you need to reduce verbosity immediately, you would need to preprocess/filter your arguments before they reach the instrumentor, or patch the wrappers locally. This is a recurring pain point, and it would be beneficial for maintainers to consider adding configuration hooks or allowlists for metadata fields in future releases.
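One possible stopgap along those lines is to filter attributes at export time. The sketch below wraps an existing exporter and drops attributes by key prefix. Note the caveats: it reaches into the SDK-private _attributes field, and the prefix list is hypothetical, so treat it as fragile rather than as a supported API:

```python
from typing import Sequence

from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult

NOISY_PREFIXES = ("input.value.messages.", "metadata.agent.")  # hypothetical keys

class FilteringExporter(SpanExporter):
    """Delegate to an inner exporter after dropping noisy attributes."""

    def __init__(self, inner: SpanExporter) -> None:
        self._inner = inner

    def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
        for span in spans:
            # _attributes is private SDK state; replacing it with a plain
            # dict works today but may break in future SDK releases.
            span._attributes = {
                key: value
                for key, value in (span.attributes or {}).items()
                if not key.startswith(NOISY_PREFIXES)
            }
        return self._inner.export(spans)

    def shutdown(self) -> None:
        self._inner.shutdown()
```

Wire it in with, for example, SimpleSpanProcessor(FilteringExporter(your_langfuse_or_otlp_exporter)).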


dosubot[bot] · Aug 06 '25 18:08

Hi @hudyweas,

Thanks for filing the ticket.

It'd be very helpful for us to have a runnable code snippet that reproduces the input/output messages in your screenshots. Also, what exactly are the spans' JSON paths to the verbose attributes?

caroger · Aug 07 '25 17:08

Thanks for the response @caroger,

The code is basic; we are doing a PoC, and I prefer not to share the company code itself. But I can give deeper insight into what makes the current monitoring such a tough experience.

(screenshot of the trace omitted)

At the top level:

Input:

  • We got a run response object with arguments like tool calls, session ID, run ID, model, etc.
  • We got a run message with a full Run Messages object that contains a system message and multiple arguments like:

, name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, files=None, audio_output=None, image_output=None, thinking=None, redacted_thinking=None, provider_data=None, citations=None, reasoning_content=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, add_to_agent_memory=True, from_history=False, metrics=MessageMetrics(input_tokens=0, output_tokens=0, total_tokens=0, audio_tokens=0, input_audio_tokens=0, output_audio_tokens=0, cached_tokens=0, cache_write_tokens=0, reasoning_tokens=0, prompt_tokens=0, completion_tokens=0, prompt_tokens_details=None, completion_tokens_details=None, additional_metrics=None, time=None, time_to_first_token=None, timer=None)

I found these arguments repeated three or four times in this nested object.

  • We got the session ID
  • stream_intermediate_steps boolean.

None of that matters at this level of the trace. The only thing that matters is the user message, and I'm unable to find it; I have to guess what it was from the model output.

Output:

  • We got what I think is the full agent output, together with metadata like the system message, token counts, etc., but what matters at this point is the content string:

content: "I can't update the project name without your approval. Would you like to change the project name to "Deliver an engineer to the Moon"?"

Even though we can easily read the message output here, this is a top-level trace, and the full 212 lines of output mentioned above are visible in the session pages.

The second level has much the same problem: if you want to check what message an agent sent to a team member, you cannot find it, because it is lost among the history, system messages, and arguments like those shown above.

For the Azure invokes there is a lot of data too, but that is fine: you only look at those when you need to debug, which is not too often.

hudyweas · Aug 08 '25 09:08

hey @dirkbrnd , would you be able to take a look at this issue?

caroger · Aug 08 '25 17:08

any updates?

hudyweas · Aug 27 '25 17:08

I'm also facing this problem.

thomas100z · Aug 28 '25 07:08

Any update?

leviethung2103 · Oct 06 '25 07:10

So sorry for the delay, folks! Thanks @hudyweas for the PR! We'll take a look this week. I've also pinged @dirkbrnd from Agno to take a look.

nate-mar · Oct 06 '25 22:10

@leviethung2103 @hudyweas We are taking a look and introducing our own PR to address!

dirkbrnd · Oct 07 '25 10:10

Hey @hudyweas

We have a PR ready that addresses this together with a bunch of other updates. If you don't mind, the PR you linked can be closed. https://github.com/Arize-ai/openinference/pull/2143

willemcdejongh · Oct 08 '25 10:10