dd-trace-dotnet icon indicating copy to clipboard operation
dd-trace-dotnet copied to clipboard

[Profiler] Make timer_create-based CPU profiler default

Open gleocadie opened this issue 10 months ago • 5 comments

Summary of changes

This is another try at making the timer_create-based CPU profiler the default one. (previous PR https://github.com/DataDog/dd-trace-dotnet/pull/6315)

Reason for change

We had to revert the previous PR because pthread_exit is explicitly called by the CLR, which starts cleaning up the thread thread_locals. But at the same time, the CPU timer_create-base kicks in and tries using, indirectly, those thread_locals.

This situation lead to crashes. In this PR, we wrap pthread_create to remove the cpu timer on the thread before continuing with pthread_exit

Implementation details

  • Wrap pthread_exit to call __dd_on_thread_routine_finished before calling the real pthread_exit
  • Make timer_create as default cpu profiler

Test coverage

Updated the tests

Other details

gleocadie avatar Jan 30 '25 10:01 gleocadie

Datadog Report

Branch report: gleocadie/another-try-at-gaing-timer_create Commit report: 916eaf5 Test service: dd-trace-dotnet

:x: 734 Failed (2 Known Flaky), 243094 Passed, 2558 Skipped, 18h 47m 5.61s Total Time

:x: Failed Tests (734)

This report shows up to 5 failed tests.

  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
    xpected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
    xpected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV0 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
    xpected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV1 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
    xpected exit code: 0, actual exit code: 139.
    
  • SubmitsTracesV1 - Datadog.Trace.ClrProfiler.IntegrationTests.AdoNet.DapperTests - Details

    Expand for error
    xpected exit code: 0, actual exit code: 139.
    

datadog-ddstaging[bot] avatar Jan 30 '25 11:01 datadog-ddstaging[bot]

Execution-Time Benchmarks Report :stopwatch:

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch test with statistical test for significance of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (69ms)  : 66, 72
     .   : milestone, 69,
    master - mean (69ms)  : 66, 73
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (1,007ms)  : 975, 1038
     .   : milestone, 1007,
    master - mean (1,004ms)  : 985, 1023
     .   : milestone, 1004,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (102ms)  : 100, 105
     .   : milestone, 102,
    master - mean (102ms)  : 100, 104
     .   : milestone, 102,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (687ms)  : 671, 704
     .   : milestone, 687,
    master - mean (687ms)  : 665, 709
     .   : milestone, 687,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (89ms)  : 87, 91
     .   : milestone, 89,
    master - mean (89ms)  : 87, 91
     .   : milestone, 89,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (647ms)  : 626, 667
     .   : milestone, 647,
    master - mean (643ms)  : 623, 662
     .   : milestone, 643,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (191ms)  : 186, 196
     .   : milestone, 191,
    master - mean (190ms)  : 186, 194
     .   : milestone, 190,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (1,109ms)  : 1075, 1143
     .   : milestone, 1109,
    master - mean (1,111ms)  : 1084, 1138
     .   : milestone, 1111,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (269ms)  : 265, 273
     .   : milestone, 269,
    master - mean (269ms)  : 265, 274
     .   : milestone, 269,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (875ms)  : 848, 902
     .   : milestone, 875,
    master - mean (876ms)  : 850, 901
     .   : milestone, 876,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6606) - mean (261ms)  : 258, 265
     .   : milestone, 261,
    master - mean (262ms)  : 259, 265
     .   : milestone, 262,

    section CallTarget+Inlining+NGEN
    This PR (6606) - mean (859ms)  : 825, 892
     .   : milestone, 859,
    master - mean (863ms)  : 823, 903
     .   : milestone, 863,

andrewlock avatar Jan 30 '25 11:01 andrewlock

Benchmarks Report for tracer :snail:

Benchmarks for #6606 compared to master:

  • 1 benchmarks are faster, with geometric mean 1.126
  • 1 benchmarks are slower, with geometric mean 1.168
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test with statistical test for significance of 5%
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 8.2μs 43.3ns 225ns 0.0159 0.00797 0 5.61 KB
master StartStopWithChild netcoreapp3.1 10.4μs 57.2ns 319ns 0.00997 0 0 5.81 KB
master StartStopWithChild net472 16.1μs 57.3ns 222ns 1.03 0.299 0.097 6.22 KB
#6606 StartStopWithChild net6.0 7.86μs 41.3ns 268ns 0.0123 0.0041 0 5.61 KB
#6606 StartStopWithChild netcoreapp3.1 10.2μs 53.1ns 276ns 0.0192 0.0096 0 5.8 KB
#6606 StartStopWithChild net472 15.9μs 39.3ns 152ns 1.04 0.305 0.104 6.21 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 500μs 714ns 2.77μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 640μs 669ns 2.5μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 853μs 1.03μs 4μs 0.425 0 0 3.3 KB
#6606 WriteAndFlushEnrichedTraces net6.0 503μs 561ns 2.17μs 0 0 0 2.7 KB
#6606 WriteAndFlushEnrichedTraces netcoreapp3.1 649μs 297ns 1.07μs 0 0 0 2.7 KB
#6606 WriteAndFlushEnrichedTraces net472 844μs 1.12μs 4.35μs 0.419 0 0 3.31 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 128μs 449ns 1.74μs 0.189 0 0 14.47 KB
master SendRequest netcoreapp3.1 144μs 243ns 942ns 0.217 0 0 17.27 KB
master SendRequest net472 0.000649ns 0.000312ns 0.00121ns 0 0 0 0 b
#6606 SendRequest net6.0 132μs 612ns 2.37μs 0.197 0 0 14.47 KB
#6606 SendRequest netcoreapp3.1 147μs 430ns 1.67μs 0.219 0 0 17.28 KB
#6606 SendRequest net472 0.000612ns 0.00017ns 0.00066ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 575μs 3.04μs 16.3μs 0.563 0 0 41.55 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 695μs 3.98μs 32.8μs 0.319 0 0 41.88 KB
master WriteAndFlushEnrichedTraces net472 849μs 3.93μs 15.7μs 8.69 2.48 0.414 53.29 KB
#6606 WriteAndFlushEnrichedTraces net6.0 572μs 3.25μs 22.3μs 0.568 0 0 41.59 KB
#6606 WriteAndFlushEnrichedTraces netcoreapp3.1 689μs 3.96μs 29.6μs 0.345 0 0 41.72 KB
#6606 WriteAndFlushEnrichedTraces net472 834μs 3.67μs 14.2μs 8.33 2.5 0.417 53.3 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.33μs 1.53ns 5.74ns 0.0146 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.76μs 1.21ns 4.54ns 0.0131 0 0 1.02 KB
master ExecuteNonQuery net472 2μs 3.77ns 14.6ns 0.157 0.000998 0 987 B
#6606 ExecuteNonQuery net6.0 1.36μs 0.979ns 3.79ns 0.0142 0 0 1.02 KB
#6606 ExecuteNonQuery netcoreapp3.1 1.78μs 1.81ns 6.76ns 0.0134 0 0 1.02 KB
#6606 ExecuteNonQuery net472 2.13μs 3.06ns 11.4ns 0.156 0.00105 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.25μs 0.825ns 3.19ns 0.0133 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.55μs 0.772ns 2.89ns 0.0132 0 0 976 B
master CallElasticsearch net472 2.56μs 1.4ns 5.42ns 0.157 0 0 995 B
master CallElasticsearchAsync net6.0 1.36μs 0.503ns 1.74ns 0.0128 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.64μs 2.5ns 9.7ns 0.0139 0 0 1.02 KB
master CallElasticsearchAsync net472 2.76μs 1.28ns 4.95ns 0.166 0 0 1.05 KB
#6606 CallElasticsearch net6.0 1.23μs 0.649ns 2.43ns 0.0136 0 0 976 B
#6606 CallElasticsearch netcoreapp3.1 1.57μs 0.443ns 1.6ns 0.0134 0 0 976 B
#6606 CallElasticsearch net472 2.66μs 1.06ns 4.1ns 0.157 0 0 995 B
#6606 CallElasticsearchAsync net6.0 1.28μs 0.615ns 2.3ns 0.0134 0 0 952 B
#6606 CallElasticsearchAsync netcoreapp3.1 1.65μs 0.588ns 2.2ns 0.0132 0 0 1.02 KB
#6606 CallElasticsearchAsync net472 2.58μs 0.689ns 2.58ns 0.167 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.24μs 0.398ns 1.43ns 0.0131 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.72μs 0.545ns 2.04ns 0.0129 0 0 952 B
master ExecuteAsync net472 1.76μs 0.474ns 1.77ns 0.145 0 0 915 B
#6606 ExecuteAsync net6.0 1.33μs 0.647ns 2.42ns 0.0134 0 0 952 B
#6606 ExecuteAsync netcoreapp3.1 1.66μs 1.2ns 4.66ns 0.0125 0 0 952 B
#6606 ExecuteAsync net472 1.79μs 0.382ns 1.43ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.33μs 2.46ns 9.19ns 0.0326 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.53μs 2.97ns 11.5ns 0.0386 0 0 2.85 KB
master SendAsync net472 7.53μs 2.65ns 10.2ns 0.493 0 0 3.12 KB
#6606 SendAsync net6.0 4.28μs 1.41ns 5.27ns 0.0321 0 0 2.31 KB
#6606 SendAsync netcoreapp3.1 5.29μs 4.15ns 16.1ns 0.0372 0 0 2.85 KB
#6606 SendAsync net472 7.68μs 1.45ns 5.63ns 0.494 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.54μs 1.34ns 5.03ns 0.024 0 0 1.69 KB
master EnrichedLog netcoreapp3.1 2.16μs 1.12ns 4.19ns 0.0221 0 0 1.69 KB
master EnrichedLog net472 2.6μs 1.48ns 5.75ns 0.258 0 0 1.62 KB
#6606 EnrichedLog net6.0 1.49μs 0.512ns 1.92ns 0.0238 0 0 1.69 KB
#6606 EnrichedLog netcoreapp3.1 2.34μs 2.96ns 11.1ns 0.0232 0 0 1.69 KB
#6606 EnrichedLog net472 2.53μs 1.11ns 4.29ns 0.258 0 0 1.62 KB
Benchmarks.Trace.Log4netBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 112μs 211ns 816ns 0.0558 0 0 4.3 KB
master EnrichedLog netcoreapp3.1 116μs 134ns 501ns 0 0 0 4.3 KB
master EnrichedLog net472 148μs 165ns 640ns 0.669 0.223 0 4.49 KB
#6606 EnrichedLog net6.0 112μs 88.4ns 319ns 0.057 0 0 4.3 KB
#6606 EnrichedLog netcoreapp3.1 117μs 130ns 505ns 0 0 0 4.3 KB
#6606 EnrichedLog net472 152μs 187ns 724ns 0.68 0.227 0 4.49 KB
Benchmarks.Trace.NLogBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.16μs 1.7ns 6.13ns 0.0316 0 0 2.25 KB
master EnrichedLog netcoreapp3.1 4.09μs 0.812ns 2.81ns 0.0308 0 0 2.25 KB
master EnrichedLog net472 4.87μs 2.45ns 9.15ns 0.328 0 0 2.07 KB
#6606 EnrichedLog net6.0 3.1μs 1.17ns 4.37ns 0.0309 0 0 2.25 KB
#6606 EnrichedLog netcoreapp3.1 4.2μs 2.17ns 8.41ns 0.0315 0 0 2.25 KB
#6606 EnrichedLog net472 4.79μs 1.75ns 6.76ns 0.328 0 0 2.07 KB
Benchmarks.Trace.RedisBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.31μs 0.571ns 2.14ns 0.0158 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.79μs 1.27ns 4.92ns 0.0153 0 0 1.14 KB
master SendReceive net472 2.09μs 0.933ns 3.61ns 0.183 0 0 1.16 KB
#6606 SendReceive net6.0 1.4μs 0.656ns 2.54ns 0.0161 0 0 1.14 KB
#6606 SendReceive netcoreapp3.1 1.76μs 0.708ns 2.74ns 0.015 0 0 1.14 KB
#6606 SendReceive net472 2.13μs 0.43ns 1.55ns 0.183 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed :heavy_check_mark: Same allocations :heavy_check_mark:

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.86μs 0.947ns 3.67ns 0.0228 0 0 1.62 KB
master EnrichedLog netcoreapp3.1 4.07μs 1.32ns 4.95ns 0.0223 0 0 1.67 KB
master EnrichedLog net472 4.34μs 2.43ns 9.4ns 0.328 0 0 2.07 KB
#6606 EnrichedLog net6.0 2.73μs 0.836ns 3.24ns 0.0232 0 0 1.62 KB
#6606 EnrichedLog netcoreapp3.1 3.91μs 0.896ns 3.35ns 0.0217 0 0 1.67 KB
#6606 EnrichedLog net472 4.64μs 2.94ns 11.4ns 0.326 0 0 2.07 KB
Benchmarks.Trace.SpanBenchmark - Faster :tada: Same allocations :heavy_check_mark:

Faster :tada: in #6606

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishScope‑netcoreapp3.1 1.126 760.19 674.94

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 420ns 0.129ns 0.481ns 0.00798 0 0 576 B
master StartFinishSpan netcoreapp3.1 589ns 0.345ns 1.29ns 0.00748 0 0 576 B
master StartFinishSpan net472 674ns 0.719ns 2.79ns 0.0918 0 0 578 B
master StartFinishScope net6.0 473ns 0.0729ns 0.273ns 0.0098 0 0 696 B
master StartFinishScope netcoreapp3.1 760ns 0.543ns 2.1ns 0.00933 0 0 696 B
master StartFinishScope net472 855ns 0.228ns 0.883ns 0.104 0 0 658 B
#6606 StartFinishSpan net6.0 448ns 0.158ns 0.612ns 0.00804 0 0 576 B
#6606 StartFinishSpan netcoreapp3.1 573ns 0.548ns 2.12ns 0.00774 0 0 576 B
#6606 StartFinishSpan net472 620ns 0.371ns 1.44ns 0.0917 0 0 578 B
#6606 StartFinishScope net6.0 483ns 0.136ns 0.528ns 0.00989 0 0 696 B
#6606 StartFinishScope netcoreapp3.1 675ns 0.352ns 1.32ns 0.00943 0 0 696 B
#6606 StartFinishScope net472 852ns 0.904ns 3.5ns 0.104 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Slower :warning: Same allocations :heavy_check_mark:

Slower :warning: in #6606

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin‑net6.0 1.168 669.70 782.17

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 670ns 0.208ns 0.804ns 0.00975 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 958ns 0.361ns 1.35ns 0.00909 0 0 696 B
master RunOnMethodBegin net472 1.05μs 0.445ns 1.67ns 0.104 0 0 658 B
#6606 RunOnMethodBegin net6.0 782ns 0.507ns 1.96ns 0.00966 0 0 696 B
#6606 RunOnMethodBegin netcoreapp3.1 906ns 0.2ns 0.747ns 0.00953 0 0 696 B
#6606 RunOnMethodBegin net472 1.1μs 0.446ns 1.73ns 0.104 0 0 658 B

andrewlock avatar Jan 30 '25 11:01 andrewlock

Snapshots difference summary

The following differences have been observed in committed snapshots. It is meant to help the reviewer. The diff is simplistic, so please check some files anyway while we improve it.

github-actions[bot] avatar Feb 06 '25 21:02 github-actions[bot]

Datadog Report

All test runs d0cf02f :link:

:white_check_mark: 2 Total Test Services: 0 Failed, 2 Passed

Test Services
Service Name Failed Known Flaky New Flaky Passed Skipped Total Time Test Service View
dd-trace-dotnet 0 0 0 171174 1824 14h 30m 59.95s Link
exploration_tests 0 0 0 22085 3 2m 9.59s Link