dd-trace-go icon indicating copy to clipboard operation
dd-trace-go copied to clipboard

[Draft; don't merge] v1 as frontend of v2

Open darccio opened this issue 3 months ago • 5 comments

What does this PR do?

Turns the current v1 a frontend of v2, wrapping the new API while keeping the original behaviour.

Motivation

Simplify internal testing and future migration to v2.

Reviewer's Checklist

  • [ ] Changed code has unit tests for its functionality at or near 100% coverage.
  • [ ] System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • [ ] There is a benchmark for any new code, or changes to existing code.
  • [ ] If this interacts with the agent in a new way, a system test has been added.
  • [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
  • [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

For Datadog employees:

  • [ ] If this PR touches code that handles credentials of any kind, such as Datadog API keys, I've requested a review from @DataDog/security-design-and-guidance.
  • [ ] This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

darccio avatar Mar 11 '24 20:03 darccio

Notes about benchmarks:

scenario:BenchmarkOTelApiWithCustomTags/datadog_otel_api-24

As reported in March 14:

  • 🟥 allocations [+3; +3] or [+10.000%; +10.000%]
  • 🟥 execution_time [+453.667ns; +514.533ns] or [+8.967%; +10.170%]

The extra allocations are:

  • ApplyV1Options added 2 allocations, one for the anonymous function and another for the v1 ddtrace.StartSpanConfig required. The latter is avoided using a pool. The same applies to ApplyV1FinishOptions.
  • No overhead from v2 detected.
  • The execution time overhead may be related to the use of pools (previously it was closer to +5%). Needs further investigation.

scenario:BenchmarkOTelApiWithCustomTags/otel_api-24

  • 🟥 execution_time [+198.903ns; +282.297ns] or [+2.669%; +3.789%]
  • 🟩 allocated_mem [-194 bytes; -184 bytes] or [-3.902%; -3.704%]

Execution time is within the acceptable range (below +5%). Reduced memory usage comes from the removal of interfaces in the API. This is consistent with v2 benchmarks.

scenario:BenchmarkStartRequestSpan-24

As reported in March 14:

  • 🟥 allocated_mem [+8 bytes; +8 bytes] or [+4.545%; +4.545%]
  • 🟥 allocations [+1; +1] or [+20.000%; +20.000%]
  • 🟥 execution_time [+19.684ns; +21.716ns] or [+4.938%; +5.447%]

The extra allocation is inevitable. It's the allocation for an anonymous function - returned by ApplyV1Options - that we need to process any v1 StartSpanOption in v2.

We also found that the current implementation of StartSpanConfig.Tags as map[string]interface{} causes a lot of allocations that can be avoided with a design that avoids interface{} as type.

The memory and execution time increases are acceptable, around +5%.

scenario:BenchmarkHttpServeTrace-24

🟥 allocated_mem [+4.671KB; +4.698KB] or [+94.121%; +94.667%] 🟥 allocations [+33; +33] or [+67.347%; +67.347%] 🟥 execution_time [+7.616µs; +7.710µs] or [+73.904%; +74.817%]

main was using the mocktracer, while v2 uses the real tracer. To avoid measuring different things, we are updating main in this PR.

The rest of benchmarks

After verifying everything works as expected, we conclude that the tracer default config in test is affecting the output of the tests. As this branch has specific details to its tests, we can assume that the real v2 performance is very similar to v1. This is what early benchmarks show in v2-dev.

darccio avatar Mar 12 '24 14:03 darccio

Benchmarks

Benchmark execution time: 2024-03-20 12:43:36

Comparing candidate commit 390dd0b103a9c93db3b6ff157c9dec808eb011eb in PR branch vits with baseline commit 485e60e44870efa9432dc0fb016942341cd5d342 in branch main.

Found 24 performance improvements and 7 performance regressions! Performance is the same for 10 metrics, 0 unstable metrics.

scenario:BenchmarkHttpServeTrace-24

  • 🟥 allocations [+3; +3] or [+3.797%; +3.797%]
  • 🟥 execution_time [+1.510µs; +1.730µs] or [+9.152%; +10.485%]

scenario:BenchmarkOTelApiWithCustomTags/datadog_otel_api-24

  • 🟥 allocations [+3; +3] or [+10.000%; +10.000%]
  • 🟥 execution_time [+740.456ns; +842.344ns] or [+15.730%; +17.895%]

scenario:BenchmarkOTelApiWithCustomTags/otel_api-24

  • 🟩 allocated_mem [-202 bytes; -188 bytes] or [-4.059%; -3.776%]

scenario:BenchmarkPartialFlushing/Disabled-24

  • 🟩 allocated_mem [-302.496MB; -302.413MB] or [-96.688%; -96.662%]
  • 🟩 allocations [-2674157; -2672761] or [-84.265%; -84.221%]
  • 🟩 avgHeapInUse(Mb) [-45.281MB; -39.629MB] or [-69.704%; -61.003%]
  • 🟩 execution_time [-252.199ms; -250.646ms] or [-91.502%; -90.939%]

scenario:BenchmarkPartialFlushing/Enabled-24

  • 🟩 allocated_mem [-400.177MB; -397.725MB] or [-97.758%; -97.159%]
  • 🟩 allocations [-2683410; -2683401] or [-84.292%; -84.292%]
  • 🟩 avgHeapInUse(Mb) [-43.806MB; -41.704MB] or [-70.147%; -66.781%]
  • 🟩 execution_time [-249.037ms; -246.381ms] or [-91.599%; -90.622%]

scenario:BenchmarkSingleSpanRetention/no-rules-24

  • 🟩 allocated_mem [-151.416KB; -151.415KB] or [-93.531%; -93.531%]
  • 🟩 allocations [-922; -922] or [-64.702%; -64.702%]
  • 🟩 execution_time [-213.623µs; -212.891µs] or [-90.413%; -90.103%]

scenario:BenchmarkSingleSpanRetention/with-rules/match-all-24

  • 🟩 allocated_mem [-153.781KB; -153.705KB] or [-93.646%; -93.600%]
  • 🟩 allocations [-922; -922] or [-64.702%; -64.702%]
  • 🟩 execution_time [-214.102µs; -213.806µs] or [-90.118%; -89.993%]

scenario:BenchmarkSingleSpanRetention/with-rules/match-half-24

  • 🟩 allocated_mem [-153.765KB; -153.703KB] or [-93.642%; -93.604%]
  • 🟩 allocations [-922; -922] or [-64.702%; -64.702%]
  • 🟩 execution_time [-214.193µs; -213.736µs] or [-90.143%; -89.951%]

scenario:BenchmarkStartRequestSpan-24

  • 🟥 allocated_mem [+8 bytes; +8 bytes] or [+4.545%; +4.545%]
  • 🟥 allocations [+1; +1] or [+20.000%; +20.000%]
  • 🟥 execution_time [+39.791ns; +41.009ns] or [+10.842%; +11.174%]

scenario:BenchmarkStartSpan-24

  • 🟩 allocated_mem [-1.524KB; -1.524KB] or [-94.541%; -94.541%]
  • 🟩 allocations [-11; -11] or [-68.750%; -68.750%]
  • 🟩 execution_time [-2.035µs; -2.025µs] or [-89.138%; -88.724%]

scenario:BenchmarkTracerAddSpans-24

  • 🟩 allocated_mem [-2.040KB; -2.040KB] or [-90.426%; -90.426%]
  • 🟩 allocations [-17; -17] or [-68.000%; -68.000%]
  • 🟩 execution_time [-3.550µs; -3.521µs] or [-90.363%; -89.621%]

pr-commenter[bot] avatar Mar 19 '24 10:03 pr-commenter[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Apr 15 '24 02:04 github-actions[bot]