dd-trace-go [WIP] Fix race/flake with spans started/finished metrics

[WIP] Fix race/flake with spans started/finished metrics

Open hannahkm opened this issue 8 months ago • 2 comments

What does this PR do?

Fixes a data race caused by #3023.

Motivation

The data race can cause data to be lost and for tests to flake. We update the new XSyncMapCounterMap to use only atomic operations when increasing and reading from its counter to ensure we do not have a race. The trade off is a slight increase in time required to run these methods on the magnitude of ~10ns.

Reviewer's Checklist

[ ] Changed code has unit tests for its functionality at or near 100% coverage.
[ ] System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
[ ] There is a benchmark for any new code, or changes to existing code.
[ ] If this interacts with the agent in a new way, a system test has been added.
[ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
[ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.
[ ] For internal contributors, a matching PR should be created to the v2-dev branch and reviewed by @DataDog/apm-go.

Unsure? Have a question? Request a review!

Mar 06 '25 15:03 hannahkm

Datadog Report

Branch report: hannahkm/fix-span-finished-tags Commit report: d8ad564 Test service: dd-trace-go

:white_check_mark: 0 Failed, 5423 Passed, 72 Skipped, 3m 11.76s Total Time

Mar 06 '25 15:03 datadog-datadog-prod-us1[bot]

Benchmarks

Benchmark execution time: 2025-03-18 21:10:27

Comparing candidate commit 2ce6e36487ed3a9d026d59ec1db1c0f1ada7756e in PR branch hannahkm/fix-span-finished-tags with baseline commit ffe26f7eb1fdc29a99c120342f73950082fcec6f in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 59 metrics, 2 unstable metrics.

scenario:BenchmarkStartSpanConcurrent-24

🟥 execution_time [+1.972µs; +2.477µs] or [+34.823%; +43.740%]

Mar 06 '25 22:03 pr-commenter[bot]

@hannahkm Is this PR still alive?

Nov 14 '25 15:11 darccio

@darccio I don't think so; another PR was merged to fix the flakes in these tests. I'll close this one!

Nov 14 '25 15:11 hannahkm

dd-trace-go dd-trace-go copied to clipboard

[WIP] Fix race/flake with spans started/finished metrics

What does this PR do?

Motivation

Reviewer's Checklist

Datadog Report

Benchmarks

scenario:BenchmarkStartSpanConcurrent-24

dd-trace-go
dd-trace-go copied to clipboard