openllmetry icon indicating copy to clipboard operation
openllmetry copied to clipboard

feat(openai): Add support for Responses.parse()

Open mgzb opened this issue 3 months ago • 2 comments

  • Add wrapper for the Responses.parse and it's asynchronous variant
  • Add test coverage
  • [x] I have added tests that cover my changes.
  • [x] If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
  • [x] PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • [x] (If applicable) I have updated the documentation accordingly.

Screenshots of traces in Jaeger (content removed as I've tried this in a real project)

Screenshot 2025-10-04 at 01 02 42 Screenshot 2025-10-04 at 01 04 50

[!IMPORTANT] Adds support for Responses.parse() in OpenAI instrumentation with synchronous and asynchronous wrappers, and comprehensive test coverage.

  • Behavior:
    • Adds responses_parse_wrapper and async_responses_parse_wrapper in responses_wrappers.py to handle structured outputs in Responses.parse().
    • Updates _instrument() and _uninstrument() in __init__.py to wrap Responses.parse and AsyncResponses.parse.
  • Tests:
    • Adds tests in test_responses_parse.py for Responses.parse() covering basic, message history, moderation, tools, reasoning, exceptions, output fallback, instructions, token usage, and response ID scenarios.
    • Adds YAML files for VCR cassettes to test various Responses.parse() scenarios.

This description was created by Ellipsis for 7c47e062dc70b8dd49b3af4e900bcc2cce292b22. You can customize this summary. It will automatically update as commits are pushed.

Summary by CodeRabbit

  • New Features

    • Enhanced OpenAI instrumentation to trace response parsing for both sync and async flows.
    • Captures richer telemetry: system/user prompts, completions, structured outputs (with fallback), tool calls, reasoning details, token usage, and response IDs.
    • Improved error reporting and proper cleanup when instrumentation is removed.
  • Tests

    • Added comprehensive cassettes and test suite covering basic flows, message history, tools, moderation, reasoning, output fallback, token usage, response ID correlation, async variants, and error scenarios.

mgzb avatar Oct 04 '25 00:10 mgzb

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Oct 04 '25 00:10 CLAassistant

[!NOTE]

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds instrumentation to wrap OpenAI v1 Responses.parse (sync and async), implements new parse wrappers that capture structured outputs, prompts, tools, reasoning, and usage into spans, updates uninstrumentation to unwrap those methods, and introduces many VCR cassettes plus a comprehensive test suite covering sync/async, tools, reasoning, moderation, fallback, and error paths.

Changes

Cohort / File(s) Summary
Instrumentation hooks
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py
Imports parse wrappers and wraps Responses.parse and AsyncResponses.parse via _try_wrap; adds corresponding unwrap calls in _uninstrument.
Response parse wrappers
packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py
Adds responses_parse_wrapper and async variants; starts spans, records exceptions and attributes (prompts, outputs, tools, reasoning, usage), serializes structured outputs, merges traced data, adds ResponseOutputMessageParamWithoutId type under RESPONSES_AVAILABLE, and adjusts set_data_attributes retrieval.
Test cassettes (responses.parse)
packages/opentelemetry-instrumentation-openai/tests/traces/cassettes/test_responses_parse/*
Adds multiple VCR cassettes for many scenarios (basic, async basic, message history, tools, moderation, reasoning, instructions, token usage, response id, output fallback) to exercise parsing and metadata.
Tests
packages/opentelemetry-instrumentation-openai/tests/traces/test_responses_parse.py
New comprehensive test suite validating spans/attributes and behaviors for sync/async parsing, tools, moderation, reasoning, fallbacks, token usage, response ID, and error paths.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant App
  participant SDK as OpenAI SDK
  participant Wrapper as Parse Wrapper
  participant API as OpenAI API
  participant Tracer

  App->>SDK: responses.parse(...) or await responses.parse(...)
  SDK->>Wrapper: invoke wrapped parse
  Wrapper->>Tracer: start span "openai.responses.parse"
  Wrapper->>API: POST /v1/responses
  API-->>Wrapper: response (id, output, usage, reasoning, tools)
  Wrapper->>Wrapper: extract/serialize output_parsed or fallback output_text
  Wrapper->>Tracer: set attributes (prompts, completion, tools, usage, reasoning, response.id)
  Wrapper-->>SDK: return parsed result
  SDK-->>App: parsed result

  alt error
    Wrapper->>Tracer: record exception & error attributes
    Wrapper-->>SDK: re-raise error
  end

  note right of Wrapper: Async path mirrors sync with await points

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • traceloop/openllmetry#3244 — Modifies OpenAI v1 responses_wrappers and parse instrumentation; likely overlaps with parse wrapper changes and attribute handling.

Suggested reviewers

  • nirga
  • dinmukhamedm

Poem

A rabbit taps keys with a hop and a cheer,
Wrapping parse calls so the traces appear.
Spans gather tokens, tools, and reasoned light,
Async or sync, they record day and night.
Carrots for tests, and traces tucked tight. 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title “feat(openai): Add support for Responses.parse()” clearly summarizes the primary change of the pull request by indicating that support for the Responses.parse() method is being added within the OpenAI instrumentation. It follows conventional commit guidelines, is concise and specific, and directly reflects the core feature introduced in the changeset. A teammate scanning the history would immediately understand the main purpose of the PR from this title.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Oct 04 '25 00:10 coderabbitai[bot]