opentelemetry-js
opentelemetry-js copied to clipboard
fix(tracing): make spans resilient to performance clock drift
This is a simpler alternative to #3384
Described by the comment here: https://github.com/open-telemetry/opentelemetry-js/issues/3279#issuecomment-1320064182
Fixes #3355
Codecov Report
Merging #3434 (830c799) into main (2dcc898) will increase coverage by
0.02%. The diff coverage is96.96%.
Additional details and impacted files
@@ Coverage Diff @@
## main #3434 +/- ##
==========================================
+ Coverage 93.78% 93.80% +0.02%
==========================================
Files 249 249
Lines 7621 7637 +16
Branches 1588 1589 +1
==========================================
+ Hits 7147 7164 +17
+ Misses 474 473 -1
| Impacted Files | Coverage Δ | |
|---|---|---|
| packages/opentelemetry-sdk-trace-base/src/Span.ts | 98.58% <95.65%> (-1.42%) |
:arrow_down: |
| ...s/opentelemetry-instrumentation-fetch/src/fetch.ts | 97.02% <100.00%> (+0.01%) |
:arrow_up: |
| ...emetry-instrumentation-xml-http-request/src/xhr.ts | 97.59% <100.00%> (+0.01%) |
:arrow_up: |
| packages/opentelemetry-core/src/common/time.ts | 98.55% <100.00%> (+2.96%) |
:arrow_up: |
| ...-trace-base/src/platform/node/RandomIdGenerator.ts | 93.75% <0.00%> (+6.25%) |
:arrow_up: |
like previous pr, codesandbox with the changes, artificial drift and most common instrumentations (with already fixed the _performanceOffset issue in first comment)
The test (Span # should have an entered time for event) is unstable with the change.
The test (Span # should have an entered time for event) is unstable with the change.
Yeah I see that. I'm trying to work on it now. ~~I think it was related to changing span end to be a Date~~
edit: I see the test relates to events not end. Still looking into it
Looks like the test was providing time 0 to startTime for the span which caused the span start to use Date.now to generate a start time. It then provided 123 to the event time which caused a drift correction. The test was flaky because it depended on the drift correction being millisecond accurate.
@legendecas should be stable now
@t2t2 I updated the web instrumentations to use Date instead of performance.now to get span end times which I think should address your concerns. I would appreciate your review because you've been very helpful. Hopefully we can get this merged soon.