opentelemetry-js
opentelemetry-js copied to clipboard
feat: allow Resource to be created with some attributes resolving asynchronously
Which problem is this PR solving?
Fixes #2912
Short description of the changes
- In
@opentelemetry/sdk-node- Adds an optional
Promise<ResourceAttributes>to the Resource constructor which can asynchronously resolve some attributes. When this promise resolves, the attributes will be merged with the "synchronous" attributes. - Deprecated
detectResources()in favor of a new methoddetectResourcesSync()which defers promises into the resource to be resolved later - Update
Detector.detect()signature to allow it to be synchronous and updated documentation that the synchronous variant should be used
- Adds an optional
- In
@opentelemetry/sdk-node, updatedNodeSDK.start()and its dependencies to now be synchronous (doesn't return a promise). This is a breaking change, but that package is experimental. - Updated
BatchSpanProcessorBaseandSimpleSpanProcessorto await the resource promise before calling exporters, if the resource has asynchronous attributes AND they have not already resolved. I.e. if there is no promise in the Resource (the case for existing detectors and code), the behavior is unchanged.
There may be some test breakages in contrib because of the nature of asynchronous tests on fire-and-forget span.end(). Users shouldn't see any behavior change.
Type of change
Please delete options that are not relevant.
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected) I don't think so?
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
- [x] Existing tests pass with no modification
- [x] Added new tests to
Resource.test.ts - [x] Added
detect-resources.test.ts
Checklist:
- [x] Followed the style guidelines of this project
- [x] Unit tests have been added
- [x] Documentation has been updated
Codecov Report
Merging #2933 (8435693) into main (3cc40d7) will decrease coverage by
1.32%. The diff coverage is96.55%.
:exclamation: Current head 8435693 differs from pull request most recent head eb6446f. Consider uploading reports for the commit eb6446f to get more accurate results
@@ Coverage Diff @@
## main #2933 +/- ##
==========================================
- Coverage 92.64% 91.32% -1.33%
==========================================
Files 187 68 -119
Lines 6169 1878 -4291
Branches 1303 399 -904
==========================================
- Hits 5715 1715 -4000
+ Misses 454 163 -291
| Impacted Files | Coverage Δ | |
|---|---|---|
| packages/opentelemetry-resources/src/Resource.ts | 97.50% <96.55%> (-2.50%) |
:arrow_down: |
| packages/opentelemetry-resources/karma.worker.js | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...ckages/opentelemetry-sdk-trace-web/karma.worker.js | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...s/opentelemetry-core/src/platform/node/sdk-info.ts | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...opentelemetry-core/src/platform/node/globalThis.ts | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...pentelemetry-core/src/platform/node/performance.ts | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...emetry-api-metrics/src/platform/node/globalThis.ts | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
| ...lemetry-resources/src/detectors/ProcessDetector.ts | 31.81% <0.00%> (-68.19%) |
:arrow_down: |
| ...perimental/packages/otlp-exporter-base/src/util.ts | 79.41% <0.00%> (-17.65%) |
:arrow_down: |
| ...ckages/opentelemetry-exporter-zipkin/src/zipkin.ts | 84.48% <0.00%> (-15.52%) |
:arrow_down: |
| ... and 120 more |
I really like this proposal - to propagate the async task down to resource attribute consumers. This is a backward-compatible (I think) and very clean way to address the issue.
The promise means that every time we need to access this resource attributes, the caller will have to await to get the full attributes values.
I wonder if that is a constraint that is easy to fulfill.
For example, in aspecto's distro we read the resource attributes in a span processor and use the values to apply filtering rules on span attributes. This is done in the onEnd invocation which is a sync function of course.
I guess we could wrap it with some mechanism that aggregates the processor spans until the resource is resolved, but that is not trivial.
Another possible issue is that we cannot have any guarantee on how long it will take for the promise to resolve. Until it resolves, spans might be piling up in the exporter export invocations, all waiting for the resource to be resolved so they can be sent on the wire.
Really interesting approach, looking forward to discussing this more.
According to spec resource is immutable. I think delaying the read until first use in e.g. exporter sounds to me like it's still spec conform.
I don't think the await can be avoided by any wrapping/API we add in common code. A consumer awaits on first use and may store the result for future use (maybe already serialized/compressed/... to avoid to redo this on every export).
I wonder if top level await + esm preload would be able to achieve this as well? @aabmass / @Flarna , do you know?
EDIT
Played around with it a bit. For ESM, adding an loader module would work and is able to asyncronously wait for stuff to resolve. The thing is, it's only for ESM.
For CJS, there's no TLA, so we still need a solution for today.
Played around with it a bit. For ESM, adding an loader module would work and is able to asyncronously wait for stuff to resolve. The thing is, it's only for ESM.
That's quite cool. Is the limitation of only a single loader module removed yet?
I think top Level await can be used but as far as I know it works only in ECMAScript module mode not for commonJs. I requires node.js 14.8 as far as I know. Not sure about browsers. Once we are in module mode we have to use loaders which are still experimental.
The limitation to have only single loader is gone on node master (see here), backport to 18.x has not yet happened as some other breaking changes in loaders are planned and target is to combine all breaking changes in a single 18.x release.
I guess now span processors are expected to handle these async promises? We already have some users which have implemented custom span processors. I guess this is considered breaking for them? What does the spec say about breaking changes for sdk interface implementers?
I would consider this also as a breaking change so this would be semver major.
This PR actually introduces another problem from the specification:
OnEnd(Span)
OnEnd is called after a span is ended (i.e., the end timestamp is already set). This method MUST be called synchronously within the Span.End() API, therefore it should not block or throw an exception.
https://github.com/open-telemetry/opentelemetry-specification/blob/9abbdd39d0b35f635f833f072013431da419894e/specification/trace/sdk.md#onendspan
This PR actually introduces another problem from the specification:
OnEnd(Span)
OnEnd is called after a span is ended (i.e., the end timestamp is already set). This method MUST be called synchronously within the Span.End() API, therefore it should not block or throw an exception.
open-telemetry/opentelemetry-specification@
9abbdd3/specification/trace/sdk.md#onendspan
I actually realized this isn't a problem. If it was, then the batch span processor would be not compliant.
This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.
This PR was closed because it has been stale for 14 days with no activity.