[Bug]: LiteLLM Instrumentation not working for Responses API
Contact Details
No response
Package Version
0.4.14
Framework Version
1.72.0
Describe the Bug
Neither normal instrumentation nor the success_callback arg works properly for litellm.responses (both work for litellm.completion).
The success callback is not working (see the docs: https://docs.litellm.ai/docs/observability/agentops_integration):
In [1]: import agentops
In [2]: import litellm
In [3]: litellm.success_callback = ["agentops"]
In [4]: agentops.init(auto_start_session=False)
...:
In [5]:
...:
...: # Non-streaming response
...: response = litellm.responses(
...: model="openai/gpt-4o",
...: input="Tell me a three sentence bedtime story about a unicorn.",
...: max_output_tokens=100
...: )
...:
...: print(response)
-> No trace detected in AgentOps dashboard
Normal instrumentation is broken as well:
In [3]: agentops.init()
🖇 AgentOps: Session Replay for default trace: https://app.agentops.ai/sessions?trace_id=24108c993e0f3b8b4d20b3e0b7e96c06
Out[3]: <agentops.legacy.Session at 0x1233f8cb0>
In [4]: import litellm
In [5]:
...: # Non-streaming response
...: response = litellm.responses(
...: model="openai/gpt-4o",
...: input="Tell me a three sentence bedtime story about a unicorn.",
...: max_output_tokens=100
...: )
...:
...: print(response)
TODO:
- Fix instrumentation
- Update AgentOps docs if required
- Raise a PR and update LiteLLM's AgentOps docs if required
Contribution
- [ ] Yes, I'd be happy to submit a pull request with these changes.
- [ ] I need some guidance on how to contribute.
- [ ] I'd prefer the AgentOps team to handle this update.
The root cause is that the OpenAI Responses API uses different token usage field names (input_tokens, output_tokens) than the Chat Completions API (prompt_tokens, completion_tokens). Since we're reusing LiteLLM's OpenTelemetry span attribute extraction logic, the best approach would be to integrate with LiteLLM properly.
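To illustrate the mismatch, here is a minimal, hypothetical normalizer (not actual AgentOps or LiteLLM code) that maps a Responses API usage dict onto the Chat Completions field names the exporter expects:

```python
def normalize_usage(usage: dict) -> dict:
    """Map either usage schema to the Chat Completions field names.

    Responses API payloads report input_tokens/output_tokens, while
    Chat Completions payloads report prompt_tokens/completion_tokens.
    """
    if "input_tokens" in usage:  # Responses API shape
        return {
            "prompt_tokens": usage["input_tokens"],
            "completion_tokens": usage["output_tokens"],
            "total_tokens": usage.get(
                "total_tokens",
                usage["input_tokens"] + usage["output_tokens"],
            ),
        }
    return usage  # already in Chat Completions shape
```

A mapping like this, applied before the span attributes are extracted, would let both API styles flow through the same exporter path.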
Due to this, our exporter throws the following error:
(DEBUG) 🖇 AgentOps: [opentelemetry.exporter.otlp.proto.http.trace_exporter] Transient error Service Unavailable encountered while exporting span batch, retrying in 1s.
It might also help to add a separate span attribute that states whether the call type is a "completion" or a "response".
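As a sketch of that suggestion (the attribute name "llm.request.type" is illustrative, not an existing AgentOps convention), the instrumentation could detect the API style from the usage payload and record it on the span:

```python
def request_type(usage: dict) -> str:
    """Classify the API style from the usage field names.

    Responses API payloads carry input_tokens/output_tokens;
    Chat Completions payloads carry prompt_tokens/completion_tokens.
    """
    return "response" if "input_tokens" in usage else "completion"

# Where the span is built, something like:
# span.set_attribute("llm.request.type", request_type(usage))
```

With that attribute in place, the exporter (and the dashboard) could branch on the request type instead of guessing which token fields to read.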