dd-trace-py
perf: reduce Span._finish_ns overhead in tracer and profiling
- Remove Hooks class usage from BaseContextProvider and Tracer
- Replace hook-based callbacks with core event dispatcher system
- Optimize _update_active logic to reduce redundant finished() checks
- Cache span._local_root property lookup in profiling link_span
- Streamline span activation flow in DefaultContextProvider
- Convert SimpleMovingAverage to Cython for C-speed mathematical operations
- Add encoder caching to reduce redundant _trace_id_64bits property lookups
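Two of the items above (caching `span._local_root` and the encoder caching of `_trace_id_64bits`) rely on the same idea: read a computed property once and reuse the result through a local variable. The sketch below illustrates that pattern only. The `Span` class and the `link_span_*` functions are hypothetical stand-ins and do not reflect the actual dd-trace-py code.

```python
class Span:
    """Hypothetical stand-in for a span; not ddtrace's Span class."""

    def __init__(self, parent=None):
        self._parent = parent
        self.lookups = 0  # counts how many times the property body runs

    @property
    def _local_root(self):
        # Walks up the parent chain on every access.
        self.lookups += 1
        node = self
        while node._parent is not None:
            node = node._parent
        return node


def link_span_uncached(span):
    # Two property accesses: one for the check, one for the use.
    if span._local_root is not None:
        return id(span._local_root)
    return None


def link_span_cached(span):
    # One property access, kept in a local variable and reused.
    local_root = span._local_root
    if local_root is not None:
        return id(local_root)
    return None


root = Span()
child = Span(parent=root)

link_span_uncached(child)
uncached_lookups = child.lookups

child.lookups = 0
cached_result = link_span_cached(child)
cached_lookups = child.lookups
```

In this toy setup the cached variant runs the property body once per call instead of twice. Savings in a real tracer depend on how expensive the property is and how often the call site runs.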
Testing against a locally running instance of PyPA Warehouse, with a single gunicorn worker and both tracing and profiling enabled, showed the following improvements:
- 4.9% more requests per second (20.08 → 21.06 RPS)
- 4.6% lower response time (49.79 ms → 47.48 ms)
- 44,028 fewer function calls (-7.2%)
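As a quick sanity check, the reported percentages can be reproduced from the before and after values quoted above. This is only a sketch: `percent_change` is a helper introduced here for illustration, and the implied baseline call count is an estimate derived from the rounded 7.2% figure, not a number taken from the PR.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values quoted in the PR description.
rps_change = percent_change(20.08, 21.06)      # requests per second
latency_change = percent_change(49.79, 47.48)  # response time in ms

# Rough estimate only: 44,028 calls is stated to be about 7.2% of the baseline.
implied_baseline_calls = 44_028 / 0.072

print(f"RPS change: {rps_change:+.1f}%")
print(f"Response time change: {latency_change:+.1f}%")
print(f"Implied baseline function calls: ~{implied_baseline_calls:,.0f}")
```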
Checklist
- [ ] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the library release note guidelines
- The change includes or references documentation updates if necessary
- Backport labels are set (if applicable)
Reviewer Checklist
- [ ] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking API changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the release branch maintenance policy
CODEOWNERS have been resolved as:
ddtrace/internal/sma.pyi @DataDog/apm-core-python
ddtrace/internal/sma.pyx @DataDog/apm-core-python
src/native/event_hub.rs @DataDog/apm-core-python
.gitignore @DataDog/apm-core-python
ddtrace/_trace/provider.py @DataDog/apm-sdk-api-python
ddtrace/_trace/tracer.py @DataDog/apm-sdk-api-python
ddtrace/internal/_encoding.pyx @DataDog/apm-core-python
ddtrace/internal/core/__init__.py @DataDog/apm-core-python
ddtrace/internal/core/event_hub.py @DataDog/apm-core-python
ddtrace/internal/datadog/profiling/stack_v2/__init__.py @DataDog/profiling-python
ddtrace/internal/native/_native.pyi @DataDog/apm-core-python
ddtrace/internal/writer/writer.py @DataDog/apm-core-python
ddtrace/profiling/collector/stack.pyx @DataDog/profiling-python
pyproject.toml @DataDog/python-guild
setup.py @DataDog/python-guild
src/native/lib.rs @DataDog/apm-core-python
tests/suitespec.yml @DataDog/python-guild @DataDog/apm-core-python
Bootstrap import analysis
Comparison of import times between this PR and base.
Summary
The average import time from this PR is: 273 ± 3 ms.
The average import time from base is: 275 ± 2 ms.
The import time difference between this PR and base is: -2.5 ± 0.1 ms.
Import time breakdown
The following import paths have grown:
- ddtrace.auto: 2.501 ms (0.92%)
- ddtrace.bootstrap.sitecustomize: 1.891 ms (0.69%)
- ddtrace.bootstrap.preload: 1.177 ms (0.43%)
- ddtrace.internal.products: 0.947 ms (0.35%)
- importlib.metadata: 0.947 ms (0.35%)
- importlib.metadata._meta: 0.660 ms (0.24%)
- importlib.abc: 0.083 ms (0.03%)
- importlib.resources: 0.083 ms (0.03%)
- importlib.resources._common: 0.083 ms (0.03%)
- zipfile: 0.072 ms (0.03%)
- ddtrace.settings.profiling: 0.142 ms (0.05%)
- ddtrace.vendor.psutil: 0.073 ms (0.03%)
- ddtrace.vendor.psutil._pslinux: 0.073 ms (0.03%)
- ddtrace.vendor.psutil._psposix: 0.073 ms (0.03%)
- ddtrace.internal.datadog.profiling.stack_v2: 0.070 ms (0.03%)
- ddtrace.internal.datadog.profiling.stack_v2._stack_v2: 0.070 ms (0.03%)
- ddtrace.internal.runtime.runtime_metrics: 0.088 ms (0.03%)
- ddtrace.internal.runtime.metric_collectors: 0.088 ms (0.03%)
- ddtrace.appsec._common_module_patches: 0.663 ms (0.24%)
- ddtrace.appsec._asm_request_context: 0.663 ms (0.24%)
- ddtrace._trace.trace_handlers: 0.051 ms (0.02%)
- ddtrace: 0.610 ms (0.22%)
- ddtrace.trace: 0.552 ms (0.20%)
- ddtrace._trace.filters: 0.445 ms (0.16%)
- ddtrace._trace.processor: 0.445 ms (0.16%)
- ddtrace.internal.writer: 0.138 ms (0.05%)
- ddtrace.internal.writer.writer: 0.138 ms (0.05%)
- gzip: 0.081 ms (0.03%)
- ddtrace.internal.sma: 0.057 ms (0.02%)
- ddtrace.internal.dogstatsd: 0.104 ms (0.04%)
- ddtrace.vendor.dogstatsd: 0.104 ms (0.04%)
- ddtrace.vendor.dogstatsd.base: 0.104 ms (0.04%)
- queue: 0.104 ms (0.04%)
- ddtrace._trace.sampler: 0.088 ms (0.03%)
- ddtrace._trace.span: 0.088 ms (0.03%)
- ddtrace.internal.sampling: 0.088 ms (0.03%)
- ddtrace._trace.sampling_rule: 0.088 ms (0.03%)
- ddtrace._trace.tracer: 0.106 ms (0.04%)
- ddtrace.internal.debug: 0.106 ms (0.04%)
- ddtrace._monkey: 0.058 ms (0.02%)
- ddtrace.settings.asm: 0.058 ms (0.02%)
- ddtrace.appsec._constants: 0.058 ms (0.02%)
The following import paths have shrunk:
- ddtrace.auto: 5.290 ms (1.94%)
- ddtrace.bootstrap.sitecustomize: 2.993 ms (1.10%)
- ddtrace.bootstrap.preload: 2.330 ms (0.85%)
- ddtrace.internal.products: 0.879 ms (0.32%)
- importlib.metadata: 0.879 ms (0.32%)
- zipfile: 0.632 ms (0.23%)
- zipfile._path: 0.632 ms (0.23%)
- csv: 0.104 ms (0.04%)
- _csv: 0.104 ms (0.04%)
- importlib.abc: 0.073 ms (0.03%)
- importlib.resources: 0.073 ms (0.03%)
- importlib.resources._legacy: 0.073 ms (0.03%)
- importlib.metadata._itertools: 0.070 ms (0.03%)
- ddtrace.internal.remoteconfig.client: 0.619 ms (0.23%)
- ddtrace.internal.runtime.runtime_metrics: 0.089 ms (0.03%)
- ddtrace.internal.runtime.tag_collectors: 0.089 ms (0.03%)
- ddtrace.settings.profiling: 0.054 ms (0.02%)
- ddtrace.internal.datadog.profiling.ddup: 0.054 ms (0.02%)
- ddtrace.internal.datadog.profiling.ddup._ddup: 0.054 ms (0.02%)
- multiprocessing.sharedctypes: 0.043 ms (0.02%)
- multiprocessing.heap: 0.043 ms (0.02%)
- mmap: 0.043 ms (0.02%)
- ddtrace.appsec._common_module_patches: 0.663 ms (0.24%)
- ddtrace.appsec._asm_request_context: 0.663 ms (0.24%)
- ddtrace.appsec._utils: 0.663 ms (0.24%)
- ddtrace: 2.298 ms (0.84%)
- ddtrace._monkey: 1.079 ms (0.40%)
- ddtrace.appsec._listeners: 0.965 ms (0.35%)
- ddtrace.internal.core: 0.965 ms (0.35%)
- ddtrace.internal.core.event_hub: 0.880 ms (0.32%)
- ddtrace.vendor.packaging.specifiers: 0.114 ms (0.04%)
- ddtrace.trace: 0.518 ms (0.19%)
- ddtrace._trace.filters: 0.382 ms (0.14%)
- ddtrace._trace.processor: 0.382 ms (0.14%)
- ddtrace._trace.sampler: 0.204 ms (0.07%)
- ddtrace._trace.span: 0.204 ms (0.07%)
- pprint: 0.121 ms (0.04%)
- ddtrace.internal.dogstatsd: 0.100 ms (0.04%)
- ddtrace.vendor.dogstatsd: 0.100 ms (0.04%)
- ddtrace.vendor.dogstatsd.base: 0.100 ms (0.04%)
- ddtrace.vendor.dogstatsd.container: 0.100 ms (0.04%)
- ddtrace.internal.writer: 0.078 ms (0.03%)
- ddtrace.internal.writer.writer: 0.078 ms (0.03%)
- ddtrace._trace.tracer: 0.099 ms (0.04%)
- ddtrace.settings.peer_service: 0.099 ms (0.04%)
- ddtrace._trace.provider: 0.037 ms (0.01%)
- ddtrace.internal._unpatched: 0.056 ms (0.02%)
- json: 0.030 ms (0.01%)
- json.decoder: 0.030 ms (0.01%)
- re: 0.030 ms (0.01%)
- enum: 0.030 ms (0.01%)
- types: 0.030 ms (0.01%)
- subprocess: 0.026 ms (0.01%)
- contextlib: 0.026 ms (0.01%)
Benchmarks
Benchmark execution time: 2025-06-20 20:32:31
Comparing candidate commit 50dc57cef55a70e987a24c01569aee999a9227e9 in PR branch brettlangdon/perf.optimize.span.finish with baseline commit 7a556744cd4dfc85a0284f486c73212bf7c633cb in branch main.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 561 metrics, 3 unstable metrics.
These changes were addressed in individual PRs.