feat(llmobs): track prompt caching for openai chat completions
Tracks the number of tokens read from the prompt cache for OpenAI chat completions.

OpenAI applies prompt caching automatically and returns a `cached_tokens` field in `prompt_tokens_details`:
https://platform.openai.com/docs/api-reference/chat/create
We rely on two keys in `metrics` for prompt caching:
- `cache_read_input_tokens`
- `cache_write_input_tokens`

We have both fields since Bedrock/Anthropic return information on both cache reads and cache writes. OpenAI's `cached_tokens` maps to `cache_read_input_tokens`.
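The mapping above can be sketched as follows. This is an illustrative example, not the ddtrace implementation: `extract_cache_metrics` is a hypothetical helper name, and the `usage` dict mirrors the shape of the `usage` object in an OpenAI chat completion response.

```python
def extract_cache_metrics(usage: dict) -> dict:
    """Map OpenAI usage fields to the prompt-caching metric keys.

    OpenAI only reports cache reads (``cached_tokens``); there is no
    write counter, since caching happens automatically server-side.
    Bedrock/Anthropic, by contrast, report both reads and writes,
    which is why both metric keys exist.
    """
    metrics = {}
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens")
    if cached is not None:
        metrics["cache_read_input_tokens"] = cached
    return metrics


# Example usage object from a chat completion response:
usage = {
    "prompt_tokens": 2048,
    "completion_tokens": 120,
    "total_tokens": 2168,
    "prompt_tokens_details": {"cached_tokens": 1024},
}
print(extract_cache_metrics(usage))  # {'cache_read_input_tokens': 1024}
```

Responses that omit `prompt_tokens_details` (e.g. older API versions) simply produce no cache metrics.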
Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the library release note guidelines
- The change includes or references documentation updates if necessary
- Backport labels are set (if applicable)
Reviewer Checklist
- [ ] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking API changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the release branch maintenance policy
CODEOWNERS have been resolved as:
releasenotes/notes/oai-p-cache-78c511f97709a357.yaml @DataDog/apm-python
tests/contrib/openai/cassettes/v1/chat_completion_prompt_caching_cache_read.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_prompt_caching_cache_write.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_stream_prompt_caching_cache_read.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/chat_completion_stream_prompt_caching_cache_write.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_prompt_caching_cache_read.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_prompt_caching_cache_write.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_stream_prompt_caching_cache_read.yaml @DataDog/ml-observability
tests/contrib/openai/cassettes/v1/responses_stream_prompt_caching_cache_write.yaml @DataDog/ml-observability
ddtrace/llmobs/_integrations/openai.py @DataDog/ml-observability
tests/contrib/openai/test_openai_llmobs.py @DataDog/ml-observability
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 275 ± 2 ms.
The average import time from base is: 277 ± 2 ms.
The import time difference between this PR and base is: -1.95 ± 0.08 ms.
Import time breakdown
The following import paths have shrunk:
- `ddtrace.auto`: 1.974 ms (0.72%)
- `ddtrace.bootstrap.sitecustomize`: 1.299 ms (0.47%)
- `ddtrace.bootstrap.preload`: 1.299 ms (0.47%)
- `ddtrace.internal.remoteconfig.client`: 0.656 ms (0.24%)
- `ddtrace`: 0.675 ms (0.25%)
- `ddtrace.internal._unpatched`: 0.032 ms (0.01%)
- `json`: 0.032 ms (0.01%)
- `json.decoder`: 0.032 ms (0.01%)
- `re`: 0.032 ms (0.01%)
- `enum`: 0.032 ms (0.01%)
- `types`: 0.032 ms (0.01%)
Benchmarks
Benchmark execution time: 2025-07-09 15:29:00
Comparing candidate commit 749f06acbb0979b8f1b2455e7c813b64ef4782b6 in PR branch evan.li/openai-prompt-caching with baseline commit 573a530416a3ed272704668fb99e74205cdfa91f in branch main.
Found 0 performance improvements and 1 performance regression. Performance is the same for 523 metrics; 2 metrics are unstable.
scenario: iastaspectsospath-ospathnormcase_aspect
- 🟥 execution_time [+398.337ns; +474.077ns] or [+11.400%; +13.567%]