dd-trace-go icon indicating copy to clipboard operation
dd-trace-go copied to clipboard

ddtrace/internal.GetGlobalTracer(): Use atomic.Value (91% less CPU)

Open evanj opened this issue 3 years ago • 2 comments

The GetGlobalTracer() function is called in many places. It currently uses sync.RWMutex to ensure the current value is read in a thread-safe fashion. We can replace this with an atomic.Value, which is substantially more efficient. According to one profile I have, a program that makes a lot of use of tracing spends approximately 0.5% of all CPU in this function, so this should be a very small improvement across all programs.

I tested with two benchmarks: A serial benchmark, which calls GetGlobalTracer using a single goroutine, and a parallel benchmark, which calls it from all CPUs at the same time. On my 8 core Macbook, I get the following results, which show this uses about 91% less CPU in the serial, uncontended case, or 98% less CPU in the contended, parallel case. I think the serial, uncontended results are more representative.

name                       old time/op    new time/op    delta
GetGlobalTracerSerial-8      13.2ns ± 2%     1.1ns ± 5%  -91.59%  (p=0.000 n=10+10)
GetGlobalTracerParallel-8    31.9ns ±15%     0.3ns ± 5%  -98.91%  (p=0.000 n=10+10)

name                       old alloc/op   new alloc/op   delta
GetGlobalTracerSerial-8       0.00B          0.00B          ~     (all equal)
GetGlobalTracerParallel-8     0.00B          0.00B          ~     (all equal)

name                       old allocs/op  new allocs/op  delta
GetGlobalTracerSerial-8        0.00           0.00          ~     (all equal)
GetGlobalTracerParallel-8      0.00           0.00          ~     (all equal)

evanj avatar Apr 05 '22 17:04 evanj

Nice optimization! This change unfortunately requires Go 1.17 for atomic.(*Value).Swap. Even if we adopt our new Go version support policy, we're still likely going to support Go 1.16 until Go 1.19 comes out. There are a few ways I think we could go here:

  • We could still use this change, but have two implementations of the global tracer stuff in different files that are tagged with //go:build go1.17 and //go:build !go1.17. Then people using recent Go releases get this better implementation, and we can drop the old implementation once we drop support for older Go versions.
  • Use the older atomic.SwapPointer & friends. This wouldn't be quite as nice as the atomic.Value implementation. I don't think there's anything wrong with that in principle but maybe users might squirm at the unsafe requirement?

Going the second way could look like this:

var globalTracer unsafe.Pointer

func init() {
	var tracer ddtrace.Tracer = &NoopTracer{}
	atomic.StorePointer(&globalTracer, unsafe.Pointer(&tracer))
}

func SetGlobalTracer(t ddtrace.Tracer) {
	old := (*ddtrace.Tracer)(atomic.SwapPointer(&globalTracer, unsafe.Pointer(&t)))
	if !Testing {
		(*old).Stop()
	}
}

func GetGlobalTracer() ddtrace.Tracer {
	return *(*ddtrace.Tracer)(atomic.LoadPointer(&globalTracer))
}

Any thoughts on what's better?

nsrip-dd avatar Apr 20 '22 13:04 nsrip-dd

Sorry for the delay, I was on parental leave. Thanks for the careful reviews! I did not notice that atomic.Value.Swap is recent. At this point, maybe another answer is to wait until August when Go 1.19 is released, assuming the support policy will drop Go 1.16 then?

I like your SwapPointer solution also. I think it should be equivalent to the Value.Swap solution? Either of those solutions seem good to me. I'm happy to update this PR to the SwapPointer solution, or just wait, whatever the maintainers would prefer.

evanj avatar Jun 06 '22 15:06 evanj

Go 1.19 has been released and we've moved support for go1.16 to legacy so I believe we should be safe to move forward with this change now :)

ajgajg1134 avatar Sep 22 '22 17:09 ajgajg1134