Unable to view outputs of tool based agents in tracing tools like LangFuse, OpenLit and Microsoft Tracing
What happened?
Describe the bug
- We are unable to view the outputs of tool-based agents in tracing tools such as LangFuse, OpenLit, and Microsoft Tracing.
- We encounter an error and are unable to export traces when using the code provided in #5992.
To Reproduce
- Code for reproducing Bug 1
```python
import os
import base64
import openlit
import asyncio
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry import trace
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.messages import AgentEvent, ChatMessage
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from autogen_core import SingleThreadedAgentRuntime
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from dotenv import load_dotenv

load_dotenv()

os.environ["LANGFUSE_PUBLIC_KEY"] = "..."
os.environ["LANGFUSE_SECRET_KEY"] = "..."
os.environ["LANGFUSE_HOST"] = "..."

LANGFUSE_AUTH = base64.b64encode(
    f"{os.environ.get('LANGFUSE_PUBLIC_KEY')}:{os.environ.get('LANGFUSE_SECRET_KEY')}".encode()
).decode()

os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = os.environ.get("LANGFUSE_HOST") + "/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"

trace_provider = TracerProvider()
trace_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(trace_provider)
tracer = trace.get_tracer(__name__)
openlit.init(tracer=tracer, disable_batch=True, disable_metrics=True)


async def Moral() -> str:
    return "No morals to this story"


async def main():
    API_KEY = os.getenv("api_key")
    Model_Name = os.getenv("model-name")
    Api_Version = os.getenv("api-version")
    Azure_Endpoint = os.getenv("azure_endpoint")
    Deployment_Name = os.getenv("deployment-name")

    az_model_client = AzureOpenAIChatCompletionClient(
        azure_deployment=Deployment_Name,
        model=Model_Name,
        api_version=Api_Version,
        azure_endpoint=Azure_Endpoint,
        api_key=API_KEY,
    )

    planning_agent = AssistantAgent(
        "PlanningAgent",
        description="An agent for planning tasks; this agent should be the first to engage when given a new task.",
        model_client=az_model_client,
        system_message="""
        You are a planning agent.
        Your job is to break down complex tasks into smaller, manageable subtasks.
        Your team members are:
            Story_writer: Writes the story and makes corrections.
            Story_reviewer: Checks whether the story is for kids and provides constructive feedback on kids' stories to add a positive, impactful ending. It doesn't write the story, only provides feedback and improvements.
            Story_moral: Finally, adds the moral to the story.
        You only plan and delegate tasks - you do not execute them yourself. You can engage team members multiple times so that a perfect story is provided.
        When assigning tasks, use this format:
        1. <agent> : <task>
        After all tasks are complete, summarize the findings and end with "TERMINATE".
        """,
    )

    # Create the Writer agent.
    Story_writer = AssistantAgent(
        "Story_writer",
        model_client=az_model_client,
        system_message="You are a helpful AI assistant which writes the story. Keep the story short.",
    )

    # Create the Reviewer agent.
    Story_reviewer = AssistantAgent(
        "Story_reviewer",
        model_client=az_model_client,
        system_message="You are a helpful AI assistant which checks whether the story is for kids and provides constructive feedback on kids' stories to have a positive, impactful ending.",
    )

    # Create the Story Moral agent.
    Story_moral = AssistantAgent(
        "Story_moral",
        model_client=az_model_client,
        system_message="""You are a helpful AI assistant which adds the moral of the story using the 'Moral' tool; it has to be written after a separator ' ========moral of the story==========='""",
        tools=[Moral],
        reflect_on_tool_use=True,
    )

    text_mention_termination = TextMentionTermination("TERMINATE")
    max_messages_termination = MaxMessageTermination(max_messages=10)
    termination = text_mention_termination | max_messages_termination

    runtime = SingleThreadedAgentRuntime(tracer_provider=trace_provider)

    with tracer.start_as_current_span("Sample-Trace") as span:
        runtime.start()
        team = SelectorGroupChat(
            [planning_agent, Story_writer, Story_reviewer, Story_moral],
            model_client=az_model_client,
            termination_condition=termination,
            runtime=runtime,
        )
        span.set_attribute("langfuse.user.id", "user-A")
        span.set_attribute("langfuse.session.id", "28")
        span.set_attribute("langfuse.tags", ["semantic-kernel", "demo-9"])
        span.set_attribute("langfuse.prompt.name", "test-9")
        await Console(team.run_stream(task="write a story on rocket crash"))
        await runtime.close()
        await az_model_client.close()  # Note: the original snippet closed an undefined `model_client` here.


if __name__ == "__main__":
    asyncio.run(main())
```
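The OTLP auth header above is just a base64-encoded `public:secret` pair. A minimal, stdlib-only sketch (with placeholder keys, not real credentials) for sanity-checking the value before debugging export errors:

```python
import base64

# Placeholder keys, not real Langfuse credentials.
public_key = "pk-lf-example"
secret_key = "sk-lf-example"

auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
otlp_headers = f"Authorization=Basic {auth}"

# Decoding the token back should recover the exact public:secret pair;
# a mismatch here usually points to a stray newline or quoting issue
# in the .env values feeding OTEL_EXPORTER_OTLP_HEADERS.
decoded = base64.b64decode(auth).decode()
assert decoded == f"{public_key}:{secret_key}"
print(otlp_headers.startswith("Authorization=Basic "))  # → True
```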
- Code for reproducing Bug 2
```python
import os
import base64
import asyncio
import openlit
from dotenv import load_dotenv
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from autogen_core import SingleThreadedAgentRuntime
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

load_dotenv()

os.environ["LANGFUSE_PUBLIC_KEY"] = "..."
os.environ["LANGFUSE_SECRET_KEY"] = "..."

LANGFUSE_AUTH = base64.b64encode(
    f"{os.environ.get('LANGFUSE_PUBLIC_KEY')}:{os.environ.get('LANGFUSE_SECRET_KEY')}".encode()
).decode()


def configure_oltp_tracing():
    langfuse_exporter = OTLPSpanExporter(endpoint="...", headers={"Authorization": f"Basic {LANGFUSE_AUTH}"})
    tracer_provider = TracerProvider(resource=Resource({"service.name": "autogen-test-agentchat"}))
    span_processor = SimpleSpanProcessor(langfuse_exporter)
    tracer_provider.add_span_processor(span_processor)
    trace.set_tracer_provider(tracer_provider)
    return tracer_provider


def search_web_tool(query: str) -> str:
    if "2006-2007" in query:
        return """Here are the total points scored by Miami Heat players in the 2006-2007 season:
        Udonis Haslem: 844 points
        Dwayne Wade: 1397 points
        James Posey: 550 points
        ...
        """
    elif "2007-2008" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2007-2008 is 214."
    elif "2008-2009" in query:
        return "The number of total rebounds for Dwayne Wade in the Miami Heat season 2008-2009 is 398."
    return "No data found."


def percentage_change_tool(start: float, end: float) -> float:
    return ((end - start) / start) * 100


async def main():
    API_KEY = os.getenv("api_key")
    Model_Name = os.getenv("model-name")
    Deployment_Name = os.getenv("deployment-name")
    Api_Version = os.getenv("api-version")
    Azure_Endpoint = os.getenv("azure_endpoint")

    model_client = AzureOpenAIChatCompletionClient(
        azure_deployment=Deployment_Name,
        model=Model_Name,
        api_version=Api_Version,
        azure_endpoint=Azure_Endpoint,
        api_key=API_KEY,
    )

    planning_agent = AssistantAgent(
        "PlanningAgent",
        description="An agent for planning tasks; this agent should be the first to engage when given a new task.",
        model_client=model_client,
        system_message="""
        You are a planning agent.
        Your job is to break down complex tasks into smaller, manageable subtasks.
        Your team members are:
            WebSearchAgent: Searches for information
            DataAnalystAgent: Performs calculations
        You only plan and delegate tasks - you do not execute them yourself.
        When assigning tasks, use this format:
        1. <agent> : <task>
        After all tasks are complete, summarize the findings and end with "TERMINATE".
        """,
    )

    web_search_agent = AssistantAgent(
        "WebSearchAgent",
        description="An agent for searching information on the web.",
        tools=[search_web_tool],
        model_client=model_client,
        system_message="""
        You are a web search agent.
        Your only tool is search_tool - use it to find information.
        You make only one search call at a time.
        Once you have the results, you never do calculations based on them.
        """,
    )

    data_analyst_agent = AssistantAgent(
        "DataAnalystAgent",
        description="An agent for performing calculations.",
        model_client=model_client,
        tools=[percentage_change_tool],
        system_message="""
        You are a data analyst.
        Given the tasks you have been assigned, you should analyze the data and provide results using the tools provided.
        If you have not seen the data, ask for it.
        """,
    )

    text_mention_termination = TextMentionTermination("TERMINATE")
    max_messages_termination = MaxMessageTermination(max_messages=25)
    termination = text_mention_termination | max_messages_termination

    selector_prompt = """Select an agent to perform the task.

    {roles}

    Current conversation context:
    {history}

    Read the above conversation, then select an agent from {participants} to perform the next task.
    Make sure the planner agent has assigned tasks before other agents start working.
    Only select one agent.
    """

    task = "Who was the Miami Heat player with the highest points in the 2006-2007 season, and what was the percentage change in his total rebounds between the 2007-2008 and 2008-2009 seasons?"

    tracer_provider = configure_oltp_tracing()
    openlit.init(tracer=tracer_provider, disable_batch=True, disable_metrics=True)
    runtime = SingleThreadedAgentRuntime(tracer_provider=tracer_provider)
    tracer = trace.get_tracer("autogen-test-agentchat")

    with tracer.start_as_current_span("runtime") as span:
        runtime.start()
        team = SelectorGroupChat(
            [planning_agent, web_search_agent, data_analyst_agent],
            model_client=model_client,
            termination_condition=termination,
            selector_prompt=selector_prompt,
            allow_repeated_speaker=True,  # Allow an agent to speak multiple turns in a row.
            runtime=runtime,
        )
        span.set_attribute("langfuse.user.id", "user-A")
        span.set_attribute("langfuse.session.id", "26")
        span.set_attribute("langfuse.tags", ["semantic-kernel", "demo-9"])
        span.set_attribute("langfuse.prompt.name", "test-9")
        await Console(team.run_stream(task=task))
        await runtime.close()
        await model_client.close()


asyncio.run(main())
```
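As a sanity check for the expected final answer in this repro, applying the `percentage_change_tool` logic to the rebound numbers returned by `search_web_tool` (214 in 2007-2008, 398 in 2008-2009) gives roughly an 86% increase:

```python
def percentage_change_tool(start: float, end: float) -> float:
    # Same logic as the tool in the repro above.
    return ((end - start) / start) * 100

# Dwayne Wade's total rebounds: 214 (2007-2008) -> 398 (2008-2009).
change = percentage_change_tool(214, 398)
print(round(change, 2))  # → 85.98
```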
Screenshots
- Screenshot for bug 1
- Screenshot for bug 2
Tags #5995 @victordibia
Which packages was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
0.4.9.2
Other library version.
No response
Model used
gpt-4o
Model provider
Azure OpenAI
Other model provider
No response
Python version
3.11
.NET version
None
Operating system
Windows
Hi @Arjun-Prakash10,
Thanks for the issue.
In your screenshot, it seems you can see the autogen runtime events (e.g., `autogen create`) but not `openai.chat.completions`. This is likely because the autogen runtime does not automatically log the chat completion content; this has to be specified.
Can you add the following to your code and see if that logs the chat completions?
```python
from opentelemetry.instrumentation.openai import OpenAIInstrumentor

OpenAIInstrumentor().instrument()
```
The examples we provide in the docs here show an implementation with the Jaeger UI. Are you seeing similar issues with Jaeger as well?
I will try out Langfuse over the next couple of days and share any findings.
Hello @victordibia, I am able to see the output after using OpenAIInstrumentor. Thanks a lot for your support. Could you also please try Langfuse on your end? Even though I am able to see the output, there are lots of warnings and some errors displayed in the Langfuse interface.
Hello, @victordibia! Thank you for the great work.
Are there any updates on this issue? We're facing the same problem with the OllamaChatCompletionClient.
Adding the equivalent instrumentation for Ollama allowed us to see the output in the dashboard, but we still see an error on the chat function call when the model is reflecting on the tool output - clicking on it shows an empty input and output.
To reach this point, we followed the Langfuse AutoGen integration guide, then gave the agent a tool and set reflect_on_tool_use.
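For reference, the Ollama equivalent of the OpenAI instrumentation suggested above is presumably something like the following sketch. The package name `opentelemetry-instrumentation-ollama` and its `OllamaInstrumentor` class come from the OpenLLMetry instrumentations and are an assumption here; the import is guarded in case the package is not installed:

```python
# Assumption: the OpenLLMetry package `opentelemetry-instrumentation-ollama`
# provides an OllamaInstrumentor analogous to OpenAIInstrumentor.
try:
    from opentelemetry.instrumentation.ollama import OllamaInstrumentor

    OllamaInstrumentor().instrument()
    instrumented = True
except ImportError:
    # Package not installed; traces will lack chat completion content.
    instrumented = False

print(instrumented)
```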
Package versions:
- autogen-agentchat: 0.6.4
- langfuse: 3.2.1
- openlit: 1.34.35