
create a span when a client or server tool is called


🚀 Describe the new functionality needed

I would like to use OpenTelemetry tracing to track my agent's activities. I created an agent as follows and registered a local `calculator` client tool plus the `builtin::websearch` tool configured on llama-stack.

Agent

    from llama_stack_client.lib.agents.agent import Agent

    # `client` is an existing LlamaStackClient; MODEL_ID is the registered model ID
    agent = Agent(
        client,
        model=MODEL_ID,
        instructions="You are a helpful assistant. Use tools when necessary.",
        sampling_params={
            "strategy": {"type": "top_p", "temperature": 1.0, "top_p": 0.9},
        },
        tools=[calculator, "builtin::websearch"],
    )

Calculator

    import sys

    from opentelemetry import trace
    from llama_stack_client.lib.agents.client_tool import client_tool

    tracer = trace.get_tracer(__name__)

    @client_tool
    def calculator(x: float, y: float, operation: str) -> dict:
        """
        Perform a basic arithmetic operation on two numbers.

        :param x: First number
        :param y: Second number
        :param operation: The operation to perform: 'add', 'subtract', 'multiply', or 'divide'
        :returns: A dictionary with keys 'success' and either 'result' or 'error'
        """
        with tracer.start_as_current_span("tool.calculator") as span:
            span.set_attribute("calculator.x", x)
            span.set_attribute("calculator.y", y)
            span.set_attribute("calculator.operation", operation)

            print(f"Call calculator: {x} {operation} {y}", file=sys.stdout, flush=True)

            try:
                if operation == "add":
                    result = x + y
                elif operation == "subtract":
                    result = x - y
                elif operation == "multiply":
                    result = x * y
                elif operation == "divide":
                    if y == 0:
                        error_msg = "Cannot divide by zero"
                        span.set_attribute("calculator.error", error_msg)
                        return {"success": False, "error": error_msg}
                    result = x / y
                else:
                    error_msg = "Invalid operation"
                    span.set_attribute("calculator.error", error_msg)
                    return {"success": False, "error": error_msg}

                span.set_attribute("calculator.result", result)
                return {"success": True, "result": result}
            except Exception as e:
                error_msg = str(e)
                span.set_attribute("calculator.exception", error_msg)
                return {"success": False, "error": error_msg}

The agent exposes an OpenAI-compatible endpoint, which is why I can then do a turn using curl:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vllm",
    "messages": [
      {"role": "user", "content": "What is 40+30?"}
    ]
  }'

This results in a calculator call and the following response:

{"id":"chatcmpl-1234","object":"chat.completion","created":1747758297,"model":"vllm","choices":[{"index":0,"message":{"role":"assistant","content":"[{\"name\": \"calculator\", \"arguments\": {\"x\": 40, \"y\": 30, \"operation\": \"add\"}}]"}

Inspecting the llama-stack traces, I don't see a span indicating that the calculator or the websearch tool was triggered.

[Screenshot: llama-stack trace view, with no spans for the tool executions]

Creating a separate span for each tool execution would help in understanding the flow of execution.
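
As a rough sketch of what I have in mind (this is not the actual llama-stack internals; `dispatch_tool` and the attribute names are hypothetical), the agent executor could wrap each tool invocation like this:

    from opentelemetry import trace
    from opentelemetry.trace import Status, StatusCode

    tracer = trace.get_tracer("llama_stack.agents")

    async def dispatch_tool_with_span(tool_name: str, arguments: dict, dispatch_tool):
        # One span per tool execution, regardless of whether the tool runs
        # client-side (like the calculator) or server-side (like builtin::websearch).
        with tracer.start_as_current_span(f"tool_execution {tool_name}") as span:
            span.set_attribute("tool.name", tool_name)
            span.set_attribute("tool.arguments", str(arguments))
            try:
                return await dispatch_tool(tool_name, arguments)
            except Exception as exc:
                span.record_exception(exc)
                span.set_status(Status(StatusCode.ERROR, str(exc)))
                raise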

💡 Why is this needed? What if we don't build it?

The idea is to identify where in llama-stack tools are called and to create a separate OpenTelemetry span for these operations. Without it, tool executions remain invisible in the traces.

Other thoughts

It would be even better if the tool execution on llama-stack propagated the trace context to the tool being called; a rough sketch of that follows the link below. I assume this is related to:

  • https://github.com/meta-llama/llama-stack/issues/2154
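
For example (a minimal sketch assuming the tool is reached over HTTP; the URL and payload shape are placeholders), llama-stack could inject the W3C `traceparent` header before calling out, so the tool continues the same trace:

    import httpx
    from opentelemetry.propagate import inject

    def call_tool_endpoint(url: str, payload: dict) -> dict:
        # Inject the current trace context (traceparent/tracestate headers)
        # into the outgoing request so the tool can join the same trace.
        headers: dict = {}
        inject(headers)
        response = httpx.post(url, json=payload, headers=headers)
        response.raise_for_status()
        return response.json()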

frzifus · May 21 '25 12:05