dd-trace-py
fix(llmobs): deprecate and convert numerical metrics to score type
The LLM Obs backend does not currently support ingesting the `numerical` metric type, so the SDK needs to be updated to:
- warn users not to submit this metric type, and
- submit any `numerical` metric types as the supported `score` metric type, for users who have already started submitting evaluation metrics with the `numerical` type.
Users can therefore still call `submit_evaluation` with the 'numerical' type; under the hood it is converted to the `score` type.
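The conversion described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code changed in this PR: the helper name `normalize_metric_type`, the `SUPPORTED_METRIC_TYPES` tuple (including the `categorical` entry), and the use of `DeprecationWarning` are all assumptions made for the example.

```python
import warnings

# Assumption: the set of types the backend accepts; only "score" is named in this PR.
SUPPORTED_METRIC_TYPES = ("categorical", "score")


def normalize_metric_type(metric_type: str) -> str:
    """Hypothetical helper: map the deprecated 'numerical' type to 'score'."""
    if metric_type == "numerical":
        # Warn the caller, but keep accepting the value so existing code still works.
        warnings.warn(
            "The 'numerical' metric type is deprecated; submitting it as 'score' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "score"
    if metric_type not in SUPPORTED_METRIC_TYPES:
        raise ValueError(f"Unsupported metric type: {metric_type!r}")
    return metric_type
```

With a helper like this, a caller that still passes `"numerical"` gets a warning and the metric is submitted as `"score"`, while callers already using a supported type are unaffected.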
Checklist
- [x] Change(s) are motivated and described in the PR description
- [x] Testing strategy is described if automated tests are not included in the PR
- [x] Risks are described (performance impact, potential for breakage, maintainability)
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Library release note guidelines are followed or label `changelog/no-changelog` is set
- [x] Documentation is included (in-code, generated user docs, public corp docs)
- [x] Backport labels are set (if applicable)
- [x] If this PR changes the public interface, I've notified `@DataDog/apm-tees`.
Reviewer Checklist
- [x] Title is accurate
- [x] All changes are related to the pull request's stated goal
- [x] Description motivates each change
- [x] Avoids breaking API changes
- [x] Testing strategy adequately addresses listed risks
- [x] Change is maintainable (easy to change, telemetry, documentation)
- [x] Release note makes sense to a user of the library
- [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- [x] Backport labels are set in a manner that is consistent with the release branch maintenance policy
Datadog Report
Branch report: evan.li/remove-numeric-supp
Commit report: a30aacb
Test service: dd-trace-py
:white_check_mark: 0 Failed, 774 Passed, 39698 Skipped, 33m 56.92s Total duration (41m 56.42s time saved)
Codecov Report
Attention: Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.
Project coverage is 27.06%. Comparing base (deadfcd) to head (70c625d). Report is 41 commits behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| ddtrace/llmobs/_llmobs.py | 0.00% | 4 Missing :warning: |
| tests/llmobs/test_llmobs_service.py | 0.00% | 3 Missing :warning: |
Additional details and impacted files
Coverage Diff
| | main | #9658 | +/- |
|---|---|---|---|
| Coverage | 75.61% | 27.06% | -48.55% |
| Files | 1336 | 1365 | +29 |
| Lines | 125991 | 127491 | +1500 |
| Hits | 95271 | 34511 | -60760 |
| Misses | 30720 | 92980 | +62260 |
Benchmarks
Benchmark execution time: 2024-07-05 15:04:00
Comparing candidate commit f34d219a87660d23cf752933e6c990a78595f3b3 in PR branch evan.li/remove-numeric-supp with baseline commit f9edeed4205d6ab854aea3d8d23b0cba26f7714d in branch main.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 221 metrics, 9 unstable metrics.