dd-trace-py

chore(llmobs): dac strip io from OpenAI

Open jsimpher opened this issue 5 months ago • 3 comments

Remove potentially sensitive I/O data from APM spans. This way, prompt and completion data will only appear on the LLM Observability spans, which are (or will be) subject to data access controls.
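As a rough illustration of the intent (not the code in this PR), the sketch below splits a flat tag dictionary into the tags that stay on the APM span and the I/O values that would go only to the LLM Observability span. The tag keys, the `IO_TAG_PREFIXES` tuple, and `split_io_tags` are hypothetical names chosen for the example.

```python
from typing import Dict, Tuple

# Hypothetical prefixes for tags that carry prompt/completion content.
IO_TAG_PREFIXES: Tuple[str, ...] = (
    "openai.request.prompt",
    "openai.request.messages",
    "openai.response.choices",
)


def split_io_tags(tags: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (apm_tags, io_tags); io_tags holds the potentially sensitive entries."""
    apm_tags: Dict[str, str] = {}
    io_tags: Dict[str, str] = {}
    for key, value in tags.items():
        # str.startswith accepts a tuple and matches any of its prefixes.
        if key.startswith(IO_TAG_PREFIXES):
            io_tags[key] = value
        else:
            apm_tags[key] = value
    return apm_tags, io_tags


# Illustrative input only; these are not real captured tags.
example_tags = {
    "openai.request.model": "example-model",
    "openai.request.prompt.0": "What is the capital of France?",
    "openai.response.choices.0.text": "Paris",
}
apm_tags, io_tags = split_io_tags(example_tags)
```

After the split, only `apm_tags` would be written to the APM span, so the prompt and completion text never lands there.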

Mostly, this just removes the I/O tag sets. A few things (mostly metrics) have llmobs tags that depend on span tags, so there is a bit more refactoring there.
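To give a sense of the kind of refactoring involved, here is a minimal sketch, assuming token metrics were previously read back from span tags and can instead be derived from the response's usage data. The function name and the output metric keys are hypothetical; the `prompt_tokens`, `completion_tokens`, and `total_tokens` fields follow the OpenAI usage object.

```python
from typing import Any, Dict, Mapping


def token_metrics_from_usage(usage: Mapping[str, Any]) -> Dict[str, int]:
    """Build token metrics from a usage mapping instead of from span tags."""
    # Treat missing or None counts as zero.
    input_tokens = int(usage.get("prompt_tokens") or 0)
    output_tokens = int(usage.get("completion_tokens") or 0)
    total = usage.get("total_tokens")
    # Fall back to the sum when the response does not report a total.
    total_tokens = int(total) if total is not None else input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }
```

Computing the metrics from the response keeps the llmobs side working even when the corresponding tags are no longer set on the APM span.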

Let me know if I removed anything that should really stay, or if I missed something that should be restricted.

This one does a lot that the others don't. I've left in things like audio transcripts and image/file retrieval, which we don't duplicate.

Checklist

  • [ ] PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • [ ] Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

jsimpher avatar Jun 26 '25 18:06 jsimpher

CODEOWNERS have been resolved as:

releasenotes/notes/remove-io-data-from-apm-span-openai-integration-81f3ae914a5d2faf.yaml  @DataDog/apm-python
ddtrace/contrib/internal/openai/_endpoint_hooks.py                      @DataDog/ml-observability
ddtrace/contrib/internal/openai/patch.py                                @DataDog/ml-observability
ddtrace/contrib/internal/openai/utils.py                                @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai.py                                  @DataDog/ml-observability
ddtrace/llmobs/_integrations/utils.py                                   @DataDog/ml-observability
ddtrace/llmobs/_utils.py                                                @DataDog/ml-observability
tests/contrib/openai/test_openai_llmobs.py                              @DataDog/ml-observability
tests/contrib/openai/test_openai_v1.py                                  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_acompletion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_chat_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_azure_openai_embedding.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_function_calling.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_image_input.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_chat_completion_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_completion.json   @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_completion_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_create_moderation.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding.json    @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_array_of_token_arrays.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_string_array.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_embedding_token_array.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_create.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_delete.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_download.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_list.json    @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_file_retrieve.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_image_b64_json_response.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_image_create.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_misuse.json       @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_delete.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_list.json   @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_model_retrieve.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response.json     @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_error.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_tools.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_response_tools_stream.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai.test_span_finish_on_stream_error.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_completion_stream_est_tokens.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_chat_completion_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_completion_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_empty_streamed_response_resp_returns.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_async.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-None].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-v0].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[None-v1].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-None].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-v0].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_service_name[mysvc-v1].json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai.test_openai_v1.test_integration_sync.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_streaming.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_sync.json  @DataDog/ml-observability
tests/snapshots/tests.contrib.openai_agents.test_openai_agents.test_openai_agents_with_tool_error.json  @DataDog/ml-observability

github-actions[bot] avatar Jun 26 '25 18:06 github-actions[bot]

Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 278 ± 3 ms.

The average import time from base is: 283 ± 7 ms.

The import time difference between this PR and base is: -4.8 ± 0.2 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 2.205 ms (0.79%)
ddtrace.bootstrap.sitecustomize 1.520 ms (0.55%)
ddtrace.bootstrap.preload 1.520 ms (0.55%)
ddtrace.internal.remoteconfig.client 0.703 ms (0.25%)
ddtrace 0.685 ms (0.25%)
ddtrace.internal._unpatched 0.034 ms (0.01%)
json 0.034 ms (0.01%)
json.decoder 0.034 ms (0.01%)
re 0.034 ms (0.01%)
enum 0.034 ms (0.01%)
types 0.034 ms (0.01%)

github-actions[bot] avatar Jun 26 '25 18:06 github-actions[bot]

Benchmarks

Benchmark execution time: 2025-07-15 19:37:22

Comparing candidate commit 94ab642ef45f55065fb35ee8338f0670d9ea3d90 in PR branch jsimpher/dac-strip-io-from-openai with baseline commit c81e59493b4d3d505795985ab898c01de9d03595 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 508 metrics, 2 unstable metrics.

pr-commenter[bot] avatar Jun 26 '25 19:06 pr-commenter[bot]

/merge

jsimpher avatar Jul 17 '25 13:07 jsimpher


2025-07-17 13:14:55 UTC :information_source: Start processing command /merge


2025-07-17 13:15:15 UTC :information_source: MergeQueue: waiting for PR to be ready

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals. Note: if you pushed new commits since the last approval, you may need additional approval. You can remove it from the waiting list with /remove command.


2025-07-17 13:17:25 UTC :information_source: MergeQueue: merge request added to the queue

The expected merge time in main is approximately 2h (p90).


2025-07-17 15:03:46 UTC :information_source: MergeQueue: This merge request was merged