dd-trace-py

feat(llmobs): track prompt caching for openai chat completions

Open · lievan opened this pull request 9 months ago • 3 comments

Tracks the number of tokens read from the prompt cache for OpenAI chat completions.

OpenAI performs prompt caching by default and returns a `cached_tokens` field in `prompt_tokens_details`: https://platform.openai.com/docs/api-reference/chat/create

We rely on two keys in `metrics` for prompt caching:

  • cache_read_input_tokens
  • cache_write_input_tokens

Both of these fields already exist because the Bedrock and Anthropic integrations report cache read/write information.

OpenAI's `cached_tokens` maps to `cache_read_input_tokens`.
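The mapping above can be sketched as follows. This is a hypothetical illustration of the described field mapping, not the actual integration code; the real logic lives in `ddtrace/llmobs/_integrations/openai.py` and its helper names will differ.

```python
def extract_cache_metrics(usage: dict) -> dict:
    """Map an OpenAI chat completion ``usage`` payload to LLMObs metric keys.

    Hypothetical helper for illustration only.
    """
    metrics = {
        "input_tokens": usage.get("prompt_tokens", 0),
        "output_tokens": usage.get("completion_tokens", 0),
    }
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens")
    if cached is not None:
        # OpenAI's cached_tokens counts tokens read from the prompt cache,
        # so it maps to the existing cache_read_input_tokens key.
        metrics["cache_read_input_tokens"] = cached
    return metrics


usage = {
    "prompt_tokens": 2048,
    "completion_tokens": 120,
    "prompt_tokens_details": {"cached_tokens": 1024},
}
print(extract_cache_metrics(usage))
```

Because OpenAI only reports cache reads (not writes) for chat completions, `cache_write_input_tokens` is left unset here.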

Checklist

  • [x] PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • [ ] Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

lievan — Jun 24 '25 16:06

CODEOWNERS have been resolved as:

releasenotes/notes/oai-p-cache-78c511f97709a357.yaml                    @DataDog/apm-python
tests/contrib/openai/cassettes/v1/chat_completion_prompt_caching_cache_read.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_prompt_caching_cache_write.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_stream_prompt_caching_cache_read.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_stream_prompt_caching_cache_write.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_prompt_caching_cache_read.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_prompt_caching_cache_write.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_stream_prompt_caching_cache_read.yaml  @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_stream_prompt_caching_cache_write.yaml  @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai.py                                  @DataDog/ml-observability
tests/contrib/openai/test_openai_llmobs.py                              @DataDog/ml-observability

github-actions[bot] — Jun 24 '25 16:06

Bootstrap import analysis

Comparison of import times between this PR and the base branch.

Summary

The average import time from this PR is: 275 ± 2 ms.

The average import time from base is: 277 ± 2 ms.

The import time difference between this PR and base is: -1.95 ± 0.08 ms.

Import time breakdown

The following import paths have shrunk:

ddtrace.auto 1.974 ms (0.72%)
ddtrace.bootstrap.sitecustomize 1.299 ms (0.47%)
ddtrace.bootstrap.preload 1.299 ms (0.47%)
ddtrace.internal.remoteconfig.client 0.656 ms (0.24%)
ddtrace 0.675 ms (0.25%)
ddtrace.internal._unpatched 0.032 ms (0.01%)
json 0.032 ms (0.01%)
json.decoder 0.032 ms (0.01%)
re 0.032 ms (0.01%)
enum 0.032 ms (0.01%)
types 0.032 ms (0.01%)

github-actions[bot] — Jun 24 '25 16:06

Benchmarks

Benchmark execution time: 2025-07-09 15:29:00

Comparing candidate commit 749f06acbb0979b8f1b2455e7c813b64ef4782b6 on PR branch evan.li/openai-prompt-caching with baseline commit 573a530416a3ed272704668fb99e74205cdfa91f on branch main.

Found 0 performance improvements and 1 performance regression. Performance is unchanged for 523 metrics; 2 metrics are unstable.

scenario:iastaspectsospath-ospathnormcase_aspect

  • 🟥 execution_time [+398.337ns; +474.077ns] or [+11.400%; +13.567%]

pr-commenter[bot] — Jun 24 '25 17:06