feat(openai): Add support for Responses.parse()
- Add a wrapper for Responses.parse and its asynchronous variant
- Add test coverage
- [x] I have added tests that cover my changes.
- [x] If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
- [x] PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
- [x] (If applicable) I have updated the documentation accordingly.
Screenshots of traces in Jaeger (content removed as I've tried this in a real project)
[!IMPORTANT]
Adds support for `Responses.parse()` in OpenAI instrumentation with synchronous and asynchronous wrappers, and comprehensive test coverage.
- Behavior:
  - Adds `responses_parse_wrapper` and `async_responses_parse_wrapper` in `responses_wrappers.py` to handle structured outputs in `Responses.parse()`.
  - Updates `_instrument()` and `_uninstrument()` in `__init__.py` to wrap `Responses.parse` and `AsyncResponses.parse`.
- Tests:
  - Adds tests in `test_responses_parse.py` for `Responses.parse()` covering basic, message history, moderation, tools, reasoning, exceptions, output fallback, instructions, token usage, and response ID scenarios.
  - Adds YAML files for VCR cassettes to test various `Responses.parse()` scenarios.

This description was created for 7c47e062dc70b8dd49b3af4e900bcc2cce292b22.
Summary by CodeRabbit
- New Features
  - Enhanced OpenAI instrumentation to trace response parsing for both sync and async flows.
  - Captures richer telemetry: system/user prompts, completions, structured outputs (with fallback), tool calls, reasoning details, token usage, and response IDs.
  - Improved error reporting and proper cleanup when instrumentation is removed.
- Tests
  - Added comprehensive cassettes and test suite covering basic flows, message history, tools, moderation, reasoning, output fallback, token usage, response ID correlation, async variants, and error scenarios.
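The "structured outputs (with fallback)" behavior can be pictured with a minimal sketch. It is not the PR's implementation: `serialize_parsed_output`, `Person`, and `FakeResponse` are hypothetical names, and the only assumption about the SDK is that a parsed response exposes `output_parsed` and `output_text`.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any, Optional


@dataclass
class Person:
    """Hypothetical structured-output schema used only for this sketch."""
    name: str
    age: int


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a parsed response object."""
    output_parsed: Any = None
    output_text: Optional[str] = None


def serialize_parsed_output(response: Any) -> Optional[str]:
    """Prefer the structured output; fall back to the raw text output."""
    parsed = getattr(response, "output_parsed", None)
    if parsed is not None:
        try:
            if hasattr(parsed, "model_dump_json"):  # Pydantic v2 models
                return parsed.model_dump_json()
            if is_dataclass(parsed) and not isinstance(parsed, type):
                return json.dumps(asdict(parsed))
            return json.dumps(parsed)
        except (TypeError, ValueError):
            pass  # not JSON-serializable: fall through to the text output
    return getattr(response, "output_text", None)
```

Falling back rather than raising keeps a serialization problem in the telemetry path from affecting the caller's request.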
Walkthrough
Adds instrumentation to wrap OpenAI v1 Responses.parse (sync and async), implements new parse wrappers that capture structured outputs, prompts, tools, reasoning, and usage into spans, updates uninstrumentation to unwrap those methods, and introduces many VCR cassettes plus a comprehensive test suite covering sync/async, tools, reasoning, moderation, fallback, and error paths.
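The wrap-on-instrument, unwrap-on-uninstrument pairing described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's code: `wrap_method`, `unwrap_method`, `recording_wrapper`, and the stand-in `Responses` class are hypothetical, and the real change goes through `_try_wrap` and unwrap calls instead.

```python
import functools
from typing import Any, Callable, List


class Responses:
    """Hypothetical stand-in for the SDK's Responses class."""

    def parse(self, **kwargs: Any) -> dict:
        return {"parsed": kwargs}


def wrap_method(cls: type, name: str, wrapper: Callable) -> None:
    """Replace cls.name with a function that routes calls through wrapper."""
    original = getattr(cls, name)

    @functools.wraps(original)  # also stores the original on __wrapped__
    def wrapped(self, *args, **kwargs):
        return wrapper(original, self, args, kwargs)

    setattr(cls, name, wrapped)


def unwrap_method(cls: type, name: str) -> None:
    """Restore the original method; a no-op if it was never wrapped."""
    original = getattr(getattr(cls, name), "__wrapped__", None)
    if original is not None:
        setattr(cls, name, original)


seen: List[str] = []


def recording_wrapper(wrapped, instance, args, kwargs):
    """Record the call, then delegate to the unbound original method."""
    seen.append(wrapped.__name__)
    return wrapped(instance, *args, **kwargs)
```

Making the unwrap step a no-op for unwrapped methods means uninstrumenting twice, or uninstrumenting after a failed wrap, leaves the class unchanged.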
Changes
| Cohort / File(s) | Summary |
|---|---|
| Instrumentation hooks: `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py` | Imports parse wrappers and wraps Responses.parse and AsyncResponses.parse via _try_wrap; adds corresponding unwrap calls in _uninstrument. |
| Response parse wrappers: `packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py` | Adds responses_parse_wrapper and async variants; starts spans, records exceptions and attributes (prompts, outputs, tools, reasoning, usage), serializes structured outputs, merges traced data, adds ResponseOutputMessageParamWithoutId type under RESPONSES_AVAILABLE, and adjusts set_data_attributes retrieval. |
| Test cassettes (responses.parse): `packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses_parse/*` | Adds multiple VCR cassettes for many scenarios (basic, async basic, message history, tools, moderation, reasoning, instructions, token usage, response id, output fallback) to exercise parsing and metadata. |
| Tests: `packages/opentelemetry-instrumentation-openai/tests/traces/test_responses_parse.py` | New comprehensive test suite validating spans/attributes and behaviors for sync/async parsing, tools, moderation, reasoning, fallbacks, token usage, response ID, and error paths. |
Sequence Diagram(s)
sequenceDiagram
autonumber
participant App
participant SDK as OpenAI SDK
participant Wrapper as Parse Wrapper
participant API as OpenAI API
participant Tracer
App->>SDK: responses.parse(...) or await responses.parse(...)
SDK->>Wrapper: invoke wrapped parse
Wrapper->>Tracer: start span "openai.responses.parse"
Wrapper->>API: POST /v1/responses
API-->>Wrapper: response (id, output, usage, reasoning, tools)
Wrapper->>Wrapper: extract/serialize output_parsed or fallback output_text
Wrapper->>Tracer: set attributes (prompts, completion, tools, usage, reasoning, response.id)
Wrapper-->>SDK: return parsed result
SDK-->>App: parsed result
alt error
Wrapper->>Tracer: record exception & error attributes
Wrapper-->>SDK: re-raise error
end
note right of Wrapper: Async path mirrors sync with await points
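The flow in the diagram (start span, call through, set attributes, record and re-raise on error, async mirroring sync) can be sketched as follows. This is a rough illustration, not the wrapper from the PR: `RecordingSpan` is a hypothetical stand-in for a tracer span, `traced_parse` and `traced_parse_async` are made-up names, and the attribute keys are placeholders rather than the instrumentation's real attribute names.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class RecordingSpan:
    """Hypothetical stand-in for a tracer span; it only records what it is given."""
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    exceptions: List[BaseException] = field(default_factory=list)
    ended: bool = False


def _record_response(span: RecordingSpan, response: Any) -> None:
    # Placeholder attribute keys, chosen for this sketch only.
    span.attributes["response.id"] = getattr(response, "id", None)
    span.attributes["completion"] = getattr(response, "output_text", None)


def _record_error(span: RecordingSpan, exc: BaseException) -> None:
    span.exceptions.append(exc)
    span.attributes["error.type"] = type(exc).__name__


def traced_parse(parse_fn: Callable[..., Any], spans: List[RecordingSpan], **kwargs: Any) -> Any:
    """Sync path: open a span, call through, record the outcome, always end the span."""
    span = RecordingSpan("openai.responses.parse")
    spans.append(span)
    try:
        response = parse_fn(**kwargs)
    except Exception as exc:
        _record_error(span, exc)
        raise  # the caller still sees the original error
    else:
        _record_response(span, response)
        return response
    finally:
        span.ended = True


async def traced_parse_async(
    parse_fn: Callable[..., Awaitable[Any]], spans: List[RecordingSpan], **kwargs: Any
) -> Any:
    """Async path: same steps as the sync path, with a single await point."""
    span = RecordingSpan("openai.responses.parse")
    spans.append(span)
    try:
        response = await parse_fn(**kwargs)
    except Exception as exc:
        _record_error(span, exc)
        raise
    else:
        _record_response(span, response)
        return response
    finally:
        span.ended = True
```

Ending the span in `finally` ensures it is closed on both the success and error paths, and re-raising keeps the wrapper transparent to callers.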
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
- traceloop/openllmetry#3244 — Modifies OpenAI v1 responses_wrappers and parse instrumentation; likely overlaps with parse wrapper changes and attribute handling.
Suggested reviewers
- nirga
- dinmukhamedm
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title “feat(openai): Add support for Responses.parse()” clearly summarizes the primary change of the pull request by indicating that support for the Responses.parse() method is being added within the OpenAI instrumentation. It follows conventional commit guidelines, is concise and specific, and directly reflects the core feature introduced in the changeset. A teammate scanning the history would immediately understand the main purpose of the PR from this title. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |