opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

Add WithClock TracerProviderOption

Open Frefreak opened this issue 4 years ago • 14 comments
trafficstars

details please see #2043

Frefreak avatar Jul 03 '21 09:07 Frefreak

Codecov Report

Merging #2052 (e5c647d) into main (0c1f156) will increase coverage by 0.0%. The diff coverage is 100.0%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #2052   +/-   ##
=====================================
  Coverage   76.0%   76.0%           
=====================================
  Files        174     174           
  Lines      12088   12105   +17     
=====================================
+ Hits        9187    9204   +17     
  Misses      2656    2656           
  Partials     245     245           
Impacted Files Coverage Δ
sdk/trace/provider.go 82.6% <100.0%> (+0.7%) :arrow_up:
sdk/trace/span.go 82.2% <100.0%> (+0.1%) :arrow_up:
sdk/trace/time.go 100.0% <100.0%> (ø)
sdk/trace/tracer.go 100.0% <100.0%> (ø)

codecov[bot] avatar Jul 03 '21 18:07 codecov[bot]

Thanks for all the suggestions & comments. Code has been updated.

Frefreak avatar Jul 06 '21 03:07 Frefreak

godoc & others updated. For the benchmark, during development I ran 2 comparison check to see the performance impact:

default clock

One is to compare the default clock against original implementation (git checkout main). The difference seems negligible or hard to differentiate (obtained by running go test -run=xxxxxMatchNothingxxxxx -test.benchtime=1s -bench=. under sdk/trace). Result of one run is below: new clock (NeverSample and DotString parts are omitted to make it shorter):

pkg: go.opentelemetry.io/otel/sdk/trace                                                                                              
cpu: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz                                                                                         
BenchmarkAttributesMapToKeyValue-8        532833              2194 ns/op                                                             
BenchmarkStartEndSpan/AlwaysSample-8     1380988               869.2 ns/op           848 B/op          9 allocs/op                            
BenchmarkSpanWithAttributes_4/AlwaysSample-8              824190              1444 ns/op            1584 B/op         17 allocs/op     
BenchmarkSpanWithAttributes_8/AlwaysSample-8              618087              1936 ns/op            2112 B/op         23 allocs/op    
BenchmarkSpanWithAttributes_all/AlwaysSample-8            784983              1794 ns/op            1936 B/op         21 allocs/op   
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         395304              3125 ns/op            3236 B/op         32 allocs/op                                           
PASS                                                                                                                                 
ok      go.opentelemetry.io/otel/sdk/trace      21.784s

original (main branch):

pkg: go.opentelemetry.io/otel/sdk/trace
cpu: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
BenchmarkAttributesMapToKeyValue-8        501423              2167 ns/op
BenchmarkStartEndSpan/AlwaysSample-8     1429974               859.0 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              965439              1473 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              594805              1935 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            638281              1812 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         399723              3103 ns/op            3236 B/op         32 allocs/op
PASS
ok      go.opentelemetry.io/otel/sdk/trace      20.962s

custom clock which add "offset"

The other check is to run against a custom clock in another repo (which is the original motivation that we want to add this clock abstraction). This clock has an clockOffset that will be added to every time.Now, which is updated periodically (several minutes or hours) in another goroutine so the overhead of updating offset can be ignore. In this clock implementation the "ns/op" stat does increase a bit, and I think this "offset" clock is a typical implementation, but after all that is custom logic. Should we include this in the benchmark to compare the default clock with an "offset" clock?

Here is a snippet of the test logic (modified from benchmark_test.go):

func tracerWithNTPClock(b *testing.B, name string, sampler sdktrace.Sampler) trace.Tracer {                                          
    // clock initialization
    clock := ...                                                                                                                                                                                                
    tp := sdktrace.NewTracerProvider(sdktrace.WithSampler(sampler), sdktrace.WithClock(&clock))                                      
    return tp.Tracer(name)                                                                                                           
}

func traceBenchmark(b *testing.B, name string, fn func(*testing.B, trace.Tracer)) {                                                  
    b.Run("AlwaysSample", func(b *testing.B) {                                                                                       
        b.ReportAllocs()                                                                                                             
        fn(b, tracer(b, name, sdktrace.AlwaysSample()))                                                                              
    })                                                                                                                               
    b.Run("AlwaysSampleWithNTPClock", func(b *testing.B) {                                                                           
        b.ReportAllocs()                                                                                                             
        fn(b, tracerWithNTPClock(b, name, sdktrace.AlwaysSample()))                                                                  
    })                                                                                                                                                                                                                                                             
}

Frefreak avatar Jul 08 '21 00:07 Frefreak

@MrAlias Thanks for the suggestion, here's what I have in mind. For a typical offset clock with the current Clock interface, we can ensure monotonic start/end times as long as the offset does not change, because the custom Since method will add the same offset back to compensate the offset added to startTime (i.e. time.Now()- startTime + offset). The problem arise when the offset is updated during a span. For a huge offset update the time might even go backwards. But still, a custom Since could detect negative duration and return a zero to slightly remedy this.

Carrying a state indeed can ensure consistent offset applied to a span's start time and end time. For an offset clock just let the Now returns an offset and pass it to Since is sufficient. But this requires saving the state somewhere (maybe a new field in span?). If that's acceptable, What about the following interface? This should gives sufficient freedom to any clock implementation to ensure the monotonic start/end time generation. For the standard default clock, it can return a nil in Now and just ignores it in Since. For an offset clock the state can just be a time.Duration.

type Clock interface {
    Now (time.Time, interface{})
    Since(interface{}, time.Time) time.Duration
}

Would this be too generalized? (and BTW I think the safety of using interface{} should be ensured by the clock implementation. go don't have type parameter in interface definition, right?)

Frefreak avatar Jul 08 '21 19:07 Frefreak

@MrAlias @Frefreak I need to go asleep but won't the following interface solve the problem better?

type Clock interface {
    Now() time.Time // default implementation: time.Now()
    End(start time.Time) time.Time // default implementation: start.Add(time.Since(start)) 
}

And probably it would be better if both methods are explicitly implemented.

pellared avatar Jul 08 '21 20:07 pellared

type Clock interface {
    Now (time.Time, interface{})
    Since(interface{}, time.Time) time.Duration
}

I am not in favor of using an interface{} type. It circumvents the Go typing system and provides not compile time guarantees that the same return value from Now is passed to Since, something the solution I proposed does not suffer from.

MrAlias avatar Jul 08 '21 22:07 MrAlias

type Clock interface {
    Now() time.Time // default implementation: time.Now()
    End(start time.Time) time.Time // default implementation: start.Add(time.Since(start)) 
}

This suffers from the same problem as the solution currently implemented in this PR. If the offset is updated between the start and end it can cause non-monotonic start/end times.

MrAlias avatar Jul 08 '21 22:07 MrAlias

... this requires saving the state somewhere (maybe a new field in span?).

Correct, the solution at its core will require a span be linked somehow to both a timestamp (time.Time) and a duration (not just a timestamp which is currently the case). How this is linked needs to be carefully evaluated and the trade-offs considered. I do not think this link between the span and timestamp/duration should be stored outside of the span as it will involve complex lookup and memory clearing logic. Meaning it should instead be included with the span itself like you are suggesting.

From the linked example I see no reason why a span could not be updated to track the associated Stopwatch returned during it's creation. For the default case this will be a wrapper around time.Since(start) which could be implemented as a method on the start time itself (and I think avoid any heap allocations). E.g.

type startTime time.Time

func (t startTime) Elapsed() time.Duration {
	return time.Since(time.Time(t))
}

type defaultClock struct{}

func (defaultClock) Start() (time.Time, Stopwatch) {
	t := time.Now()
	return t, startTime(t)
}

MrAlias avatar Jul 08 '21 22:07 MrAlias

I have made some modifications regarding the suggested direction given by @MrAlias . Probably there are still many places that can be improved, though. To summary a few important points regarding the change:

  1. interface now looks like this:
type Clock interface {
	Start() (time.Time, Stopwatch)
}

type Stopwatch interface {
	Stop(time.Time) time.Time
}

I think extracting the startTime out and pass it to stopwatch as parameter would make the interface easier to use. Because most of the implementation of stopwatch will use this startTime anyways, and since the stopwatch will be stored along side startTime in the span there's no need requiring the custom stopwatch to embed the startTime info itself. Also I make Stopwatch.Stop returns the end time directly to reduce some unnecessary Add, Sub operation (otherwise there will be a scenario where Stop may doing some Sub and the caller Add-ed the elapsed time back to the startTime). As a result original monotonicEndTime is removed and its logic is used in standardClock/standardStopwatch.

  1. added a check for span.stopwatch == nil in span End function
  2. made some modification to "doc"
  3. made some benchmark of this and current main (go test -run=xxxxxMatchNothingxxxxx -test.benchtime=1s -bench=. | grep AlwaysSample):

clock (this PR) run 1

BenchmarkStartEndSpan/AlwaysSample-8     1239268               927.0 ns/op           880 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              673768              1502 ns/op            1616 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              565441              2115 ns/op            2144 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            598950              1853 ns/op            1968 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         395683              3267 ns/op            3268 B/op         32 allocs/op

run 2

BenchmarkStartEndSpan/AlwaysSample-8     1286404               948.9 ns/op           880 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              906268              1535 ns/op            1616 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              709358              2129 ns/op            2144 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            606974              1849 ns/op            1968 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         320420              3289 ns/op            3268 B/op         32 allocs/op

run 3

BenchmarkStartEndSpan/AlwaysSample-8     1340974               889.9 ns/op           880 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              940291              1482 ns/op            1616 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              588006              2089 ns/op            2144 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            592368              1948 ns/op            1968 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         315806              3298 ns/op            3268 B/op         32 allocs/op

main run 1

BenchmarkStartEndSpan/AlwaysSample-8     1362858               850.6 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              935188              1520 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              516552              2263 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            772426              1902 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         382783              3249 ns/op            3236 B/op         32 allocs/op

run 2

BenchmarkStartEndSpan/AlwaysSample-8     1378478               942.9 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              716536              1556 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              563935              2081 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            765418              2000 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         344926              3303 ns/op            3236 B/op         32 allocs/op

run 3

BenchmarkStartEndSpan/AlwaysSample-8     1328388               946.2 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              719095              1542 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              681996              2059 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            687136              1957 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         437308              3332 ns/op            3236 B/op         32 allocs/op

Frefreak avatar Jul 09 '21 07:07 Frefreak

I think we are trending in the correct direction. :smile:

type Clock interface {
	Start() (time.Time, Stopwatch)
}

type Stopwatch interface {
	Stop(time.Time) time.Time
}

I think extracting the startTime out and pass it to stopwatch as parameter would make the interface easier to use. Because most of the implementation of stopwatch will use this startTime anyways, and since the stopwatch will be stored along side startTime in the span there's no need requiring the custom stopwatch to embed the startTime info itself.

Extracting the start time out of the Stopwatch and requiring it be passed as a parameter not only adds responsibility to the user to store two values for ultimately a single purpose, it introduces a bug where the wrong start time is passed to the Stop method. I would prefer to encapsulate the start time in the Stopwatch instead, in fact I think by doing this and exposing a method to access it we can simplify the Clock interface and only store the Stopwatch in the span instead of it and a start time.

Also I make Stopwatch.Stop returns the end time directly to reduce some unnecessary Add, Sub operation (otherwise there will be a scenario where Stop may doing some Sub and the caller Add-ed the elapsed time back to the startTime). As a result original monotonicEndTime is removed and its logic is used in standardClock/standardStopwatch.

Having this return a time will introduce the possibility for users to compare times with different monotonic clocks. If a Stopwatch.Stop returns a time that has an accurate wall clock but not an accurate monotonic clock when the span is exported to a back-end that records start time and duration it will record the wrong duration when it subtracts the end time from the start time. By having the Stopwatch return a duration which the SDK can ensure sets the end time's monotonic clock appropriately this will not be an issue. This seems warranted even if there will be Adds to compliment Subs, this is just a byproduct of the Go time package's handling of monotonic time as I see it.

Based on this feedback, I would recommend using something more along these lines.

type Clock interface {
	// Stopwatch returns a started Stopwatch measuring a time interval.
	Stopwatch() Stopwatch
}

type Stopwatch interface {
	// Started returns the time the Stopwatch was started
	Started() time.Time
	// Elapsed returns the duration from when this Stopwatch was started and now using a consistent monotonic time.
	Elapsed() time.Duration
}

This way the span can store just the created Stopwatch instead of having to track it and a start time. I would recommend adding an unexported startTime() method to the span. E.g.

func (s span) startTime() time.Time {
	if s.stopwatch != nil {
		return s.stopwatch.Started()
	}
	return time.Time{}
}

This method can then replace the use of the current field.

MrAlias avatar Jul 09 '21 19:07 MrAlias

Although I personally still prefer the previous interface because I think the correctness of endtime (when Stop called) should leaves to the custom clock implementation (for default clock it should behave like before), your suggested interface does have its own advantages. I have modified the code according your snippet. Now the test passed but I'm a little worried about the overhead introduced by this abstraction, overall it seems the ns/op and allocs have a noticeable increase.

clock (this PR)

BenchmarkStartEndSpan/AlwaysSample-8     1195243              1029 ns/op             896 B/op         11 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              837748              1634 ns/op            1632 B/op         19 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              425994              2435 ns/op            2160 B/op         25 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            680662              2080 ns/op            1984 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         328867              3386 ns/op            3284 B/op         34 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1255443               996.4 ns/op           896 B/op         11 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              734005              1667 ns/op            1632 B/op         19 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              545829              2293 ns/op            2160 B/op         25 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            706772              2005 ns/op            1984 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         344545              3320 ns/op            3284 B/op         34 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1000000              1018 ns/op             896 B/op         11 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              693843              1589 ns/op            1632 B/op         19 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              494343              2251 ns/op            2160 B/op         25 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            700088              1969 ns/op            1984 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         376003              3559 ns/op            3284 B/op         34 allocs/op

main

BenchmarkStartEndSpan/AlwaysSample-8     1333357               900.7 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              950784              1536 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              521936              1980 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            546112              1873 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         319387              3258 ns/op            3236 B/op         32 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1000000              1012 ns/op             848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              687025              1574 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              586437              2058 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            745182              1865 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         316908              3292 ns/op            3236 B/op         32 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1351084               889.6 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              756207              1517 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              578827              2028 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            637530              1975 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         350178              3346 ns/op            3236 B/op         32 allocs/op

Do you think its acceptable or do you spot any optimization that can be applied? @MrAlias

Frefreak avatar Jul 10 '21 08:07 Frefreak

Updated. Here's the current benchmark result. From a quick look there seems to have a constant 30~70 ns/op overhead (maximum ~10% for simple span, maximum ~3% for span with 10 attributes) and one more allocation introduced.

main:

BenchmarkStartEndSpan/AlwaysSample-8     1632331               746.1 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              928948              1278 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              696939              1743 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            767371              1556 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         446376              2675 ns/op            3236 B/op         32 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1623552               732.8 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              973208              1292 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              720900              1682 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            788205              1542 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         444885              2704 ns/op            3236 B/op         32 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1629579               735.2 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              913605              1258 ns/op            1584 B/op         17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              732787              1685 ns/op            2112 B/op         23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            794674              1525 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         424996              2746 ns/op            3236 B/op         32 allocs/op

clock:

BenchmarkStartEndSpan/AlwaysSample-8     1495801               798.3 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              912394              1548 ns/op            1608 B/op         18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              669765              1791 ns/op            2136 B/op         24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            758274              1591 ns/op            1960 B/op         22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         442693              2768 ns/op            3260 B/op         33 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1511936               805.7 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              939487              1309 ns/op            1608 B/op         18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              698484              1764 ns/op            2136 B/op         24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            758954              1592 ns/op            1960 B/op         22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         439455              2747 ns/op            3260 B/op         33 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8     1521274               788.2 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              938434              1303 ns/op            1608 B/op         18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              684704              1752 ns/op            2136 B/op         24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            772917              1605 ns/op            1960 B/op         22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         424970              2794 ns/op            3260 B/op         33 allocs/op

Frefreak avatar Jul 21 '21 08:07 Frefreak

I added a parameter to clock's Stopwatch method (for a standard default clock this basically moves the IsZero check from the caller into the clock's Stopwatch method). For a custom clock the behavior would depends on the implementation (since for some custom clock it may decide that a passed in time makes not sense).

I also thought about adding a similar parameter to Stopwatch.Stop too, but from the current End function it would need to move some logic after the IsRecoding check to the beginning of the function, which kinds of defeat the purpose of calculating an end time as soon as possible. Besides the current logic already achieves what we want: use the user provided end time if its provided. So I just leaves it as is.

Current benchmark (3 rounds each branch): main branch:

BenchmarkStartEndSpan/AlwaysSample-8     1564064               785.6 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              924451              1330 ns/op            1584 B/op          17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              643656              1769 ns/op            2112 B/op          23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            794772              1537 ns/op            1936 B/op          21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         452740              2678 ns/op            3236 B/op          32 allocs/op

BenchmarkStartEndSpan/AlwaysSample-8     1542901               760.7 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              900636              1265 ns/op            1584 B/op          17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              720046              1697 ns/op            2112 B/op          23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            774172              1558 ns/op            1936 B/op          21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         453163              2698 ns/op            3236 B/op          32 allocs/op

BenchmarkStartEndSpan/AlwaysSample-8     1562756               759.7 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              898989              1280 ns/op            1584 B/op          17 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              718977              1697 ns/op            2112 B/op          23 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            800170              1561 ns/op            1936 B/op          21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         449101              2678 ns/op            3236 B/op          32 allocs/op

clock branch (which is the default clock):

BenchmarkStartEndSpan/AlwaysSample-8     1483533               799.9 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              844624              1335 ns/op            1608 B/op           18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              692430              1753 ns/op            2136 B/op           24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            737242              1601 ns/op            1960 B/op           22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         442994              2743 ns/op            3260 B/op           33 allocs/op

BenchmarkStartEndSpan/AlwaysSample-8     1494285               800.3 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              920941              1306 ns/op            1608 B/op           18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              698163              1752 ns/op            2136 B/op           24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            719253              1599 ns/op            1960 B/op           22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         448485              2763 ns/op            3260 B/op           33 allocs/op

BenchmarkStartEndSpan/AlwaysSample-8     1491642               805.5 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              941373              1319 ns/op            1608 B/op           18 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              698836              1784 ns/op            2136 B/op           24 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            767616              1595 ns/op            1960 B/op           22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         446974              2746 ns/op            3260 B/op           33 allocs/op

Since there were some refactors happened on the main branch, a more thorough review might be necessary in case I missed something important.

Frefreak avatar Sep 14 '21 07:09 Frefreak

Sorry for the long pause from last update. We still think this feature might be useful in some cases (although to be honest not very useful to most people). I just merged the current main into my branch. But since the discussion of this PR has been already long and its a long time since the first time it submitted, should I reopen a PR for a clean review or is it okay to continue here?

BTW this is the current benchmark for a comparison before & after this PR:

main:

BenchmarkAttributesMapToKeyValue-8                598554              1932 ns/op
BenchmarkSpanProcessor-8                          105206             11924 ns/op           12592 B/op        105 allocs/op
BenchmarkSpanProcessorVerboseLogging-8             99156             11508 ns/op           13736 B/op        111 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8             1447701               858.8 ns/op           848 B/op          9 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              822363              1461 ns/op            1760 B/op         19 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              592827              2077 ns/op            2464 B/op         27 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            791739              1563 ns/op            1936 B/op         21 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         451161              2720 ns/op            3236 B/op         32 allocs/op
BenchmarkSpanWithEvents_4/AlwaysSample-8                  737056              1666 ns/op            1536 B/op         20 allocs/op
BenchmarkSpanWithEvents_8/AlwaysSample-8                  503560              2387 ns/op            2240 B/op         29 allocs/op
BenchmarkSpanWithEvents_WithStackTrace/AlwaysSample-8            1000000              1045 ns/op            1024 B/op         13 allocs/op
BenchmarkSpanWithEvents_WithTimestamp/AlwaysSample-8             1000000              1032 ns/op            1048 B/op         14 allocs/op

clock:

BenchmarkAttributesMapToKeyValue-8                597698              1813 ns/op
BenchmarkSpanProcessor-8                           99366             11682 ns/op           12832 B/op        115 allocs/op
BenchmarkSpanProcessorVerboseLogging-8            101935             11701 ns/op           13976 B/op        121 allocs/op
BenchmarkStartEndSpan/AlwaysSample-8             1479482               809.1 ns/op           872 B/op         10 allocs/op
BenchmarkSpanWithAttributes_4/AlwaysSample-8              794362              1754 ns/op            1784 B/op         20 allocs/op
BenchmarkSpanWithAttributes_8/AlwaysSample-8              571761              2154 ns/op            2488 B/op         28 allocs/op
BenchmarkSpanWithAttributes_all/AlwaysSample-8            738685              1627 ns/op            1960 B/op         22 allocs/op
BenchmarkSpanWithAttributes_all_2x/AlwaysSample-8         438031              2765 ns/op            3260 B/op         33 allocs/op
BenchmarkSpanWithEvents_4/AlwaysSample-8                  718692              1666 ns/op            1560 B/op         21 allocs/op
BenchmarkSpanWithEvents_8/AlwaysSample-8                  500578              2427 ns/op            2264 B/op         30 allocs/op
BenchmarkSpanWithEvents_WithStackTrace/AlwaysSample-8            1000000              1055 ns/op            1048 B/op         14 allocs/op
BenchmarkSpanWithEvents_WithTimestamp/AlwaysSample-8             1000000              1067 ns/op            1072 B/op         15 allocs/op

Frefreak avatar Jan 19 '22 09:01 Frefreak