dd-trace-py icon indicating copy to clipboard operation
dd-trace-py copied to clipboard

fix(tracing): only extract distributed headers if a trace is not already started

Open brettlangdon opened this issue 1 year ago • 6 comments

Today there are some integrations which always check for distributed tracing headers and if found will activate them as the current/parent context. This blind approach does not take into account if a parent trace has already been started.

This means if you have nested Flask apps (for example), then the child Flask app may make it's parent the upstream remote service, and not the first/initial Flask app in the process. This can result in a broken trace.

The scenario which "works" today which will be broken with this change is if there is "broken" instrumentation as the first span in a trace. This could either be custom instrumentation that did not check for distributed headers, or a built-in integration which is not properly parsing/activating distributed tracing headers.

This change would prioritize keeping the local trace accurate over ensuring a connection with the upstream calling service.

Checklist

  • [x] Change(s) are motivated and described in the PR description
  • [x] Testing strategy is described if automated tests are not included in the PR
  • [x] Risks are described (performance impact, potential for breakage, maintainability)
  • [x] Change is maintainable (easy to change, telemetry, documentation)
  • [x] Library release note guidelines are followed or label changelog/no-changelog is set
  • [x] Documentation is included (in-code, generated user docs, public corp docs)
  • [x] Backport labels are set (if applicable)
  • [x] If this PR changes the public interface, I've notified @DataDog/apm-tees.

Reviewer Checklist

  • [x] Title is accurate
  • [x] All changes are related to the pull request's stated goal
  • [x] Description motivates each change
  • [x] Avoids breaking API changes
  • [x] Testing strategy adequately addresses listed risks
  • [x] Change is maintainable (easy to change, telemetry, documentation)
  • [x] Release note makes sense to a user of the library
  • [x] Author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • [x] Backport labels are set in a manner that is consistent with the release branch maintenance policy

brettlangdon avatar Jun 03 '24 11:06 brettlangdon

Datadog Report

Branch report: brettlangdon/fix.extra.distributed.extract Commit report: e7ca717 Test service: dd-trace-py

:white_check_mark: 0 Failed, 120670 Passed, 56233 Skipped, 4h 17m 33.35s Total duration (5h 58m 24.97s time saved)

Are we going to backport this to 2.8 and 2.9 (after release)? Or are we introducing it in 2.10 since it would break some cases that currently work?

I was thinking just 2.10. I do consider this a fix for a broken scenario, but not urgent or pressing enough to consider backporting it.

brettlangdon avatar Jun 06 '24 15:06 brettlangdon

Benchmarks

Benchmark execution time: 2025-01-17 21:12:15

Comparing candidate commit 4764abcd929c21ff434a7936b138db926c3ccd1b in PR branch brettlangdon/fix.extra.distributed.extract with baseline commit ef4c99759f8c1d87642718db2292ae98e073b3c6 in branch main.

Found 3 performance improvements and 0 performance regressions! Performance is the same for 391 metrics, 2 unstable metrics.

scenario:flasksqli-appsec-enabled

  • 🟩 execution_time [-172.187µs; -162.940µs] or [-7.411%; -7.013%]

scenario:flasksqli-tracer-enabled

  • 🟩 execution_time [-182.098µs; -172.750µs] or [-7.809%; -7.408%]

scenario:iast_aspects-ospathnormcase_aspect

  • 🟩 execution_time [-247.232ns; -202.048ns] or [-9.003%; -7.358%]

pr-commenter[bot] avatar Jun 06 '24 16:06 pr-commenter[bot]

Codecov Report

Attention: Patch coverage is 7.69231% with 48 lines in your changes missing coverage. Please review.

Project coverage is 10.42%. Comparing base (c7e92f6) to head (c8b707e). Report is 1 commits behind head on main.

Files Patch % Lines
tests/contrib/wsgi/test_wsgi.py 0.00% 47 Missing :warning:
ddtrace/settings/config.py 0.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #9456       +/-   ##
===========================================
- Coverage   74.30%   10.42%   -63.88%     
===========================================
  Files        1398     1365       -33     
  Lines      129873   127713     -2160     
===========================================
- Hits        96500    13320    -83180     
- Misses      33373   114393    +81020     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jul 11 '24 19:07 codecov-commenter

Datadog Report

Branch report: brettlangdon/fix.extra.distributed.extract Commit report: 4764abc Test service: dd-trace-py

:white_check_mark: 0 Failed, 1048 Passed, 550 Skipped, 16m 42.26s Total duration (23m 9.64s time saved)

CODEOWNERS have been resolved as:

releasenotes/notes/fix-unncessary-header-extraction-c1facf2b30331afb.yaml  @DataDog/apm-python
ddtrace/contrib/trace_utils.py                                          @DataDog/apm-core-python @DataDog/apm-idm-python
ddtrace/settings/config.py                                              @DataDog/python-guild @DataDog/apm-sdk-api-python
tests/contrib/wsgi/test_wsgi.py                                         @DataDog/apm-core-python @DataDog/apm-idm-python
tests/utils.py                                                          @DataDog/python-guild

github-actions[bot] avatar Sep 16 '24 14:09 github-actions[bot]