opentelemetry-js
opentelemetry-js copied to clipboard
active context not added to telemetry
What happened?
Steps to Reproduce
If you look though the startSpan() code in the tracer, at no point do the context attributes get evaluated for inclusion into the trace information. Instead it looks to options for attributes, which I don't understand
Expected Result
Correlation ID should be logged out
Actual Result
attributes = {}
Additional Details
OpenTelemetry Setup Code
static CORRELATION_ID = api.createContextKey("correlationId");
static HOST_NAME = api.createContextKey("net.host.name");
static ENV = api.createContextKey("app.env");
let base = api.context.active();
this.context = base.setValue(OpenTelemetry.CORRELATION_ID, Context.get("Logging-CorrelationId"))
.setValue(OpenTelemetry.HOST_NAME, agent.server.name)
.setValue(OpenTelemetry.ENV, process.env.NODE_ENV);
...
let span = tracer.startSpan(`${this.name}.${type}`, {});
package.json
"@opentelemetry/api": "~1.1.0",
"@opentelemetry/exporter-trace-otlp-http": "~0.33.0",
"@opentelemetry/sdk-trace-base": "~1.6.0",
Relevant log output
{
traceId: 'c3c624a0b67981bee888273a3a8cc6c8',
parentId: undefined,
name: 'request.get',
id: '3979cf0eaace930f',
kind: 0,
timestamp: 1664820678226331,
duration: 1288183,
attributes: {},
status: { code: 1 },
events: [],
links: []
}
The behavior is as expected. Why do you think that startSpan() should add something from context into a span?
context is used to propagate information across async invocations, a bit similar to thread local in Java/C++. It's used by tracer to find the parent span and it's used by propagator to propagate between processes.
State tracking is not explained anywhere in the documentation. And the examples that try to explain inheritance don't actually explain it. Essentially it's tribal knowledge.
There's a relationship in the docs between context and spans, and between spans, that I haven't seen demonstrated in any of the code I've written so far.
A Span has a parentSpanId that is always blank because context-utils.setSpan() returns a new context, which does not interact in any way, shape, or form with context.active() (which is called by startSpan(name, options) and friends). I'm not sure how span nesting is intended to work, but the toy examples do not flow at all like real code, and for that matter I can't even get the parentSpanID to be set by writing nested code that just calls startActiveSpan().
So is it possible to set a default attribute for a span? And how do you get parentSpanId !== undefined?
I agree that docs are by far not perfect - in special regarding Context/ContextManager. Everyone is welcome to help in improving :o).
There's a relationship in the docs between context and spans, and between spans, that I haven't seen demonstrated in any of the code I've written so far.
maybe this helps
So is it possible to set a default attribute for a span?
You can't set default attributes as spans are intended to describe a single operation. Therefore having the same data on all spans would be quite redundant. Maybe you should look into resource which is essentially a set of Attributes set once at startup for a process and exported with each span.
And how do you get parentSpanId !== undefined?
Via context. Simplest way is to do this manually:
import { trace, context, ROOT_CONTEXT } from "@opentelemetry/api";
const spanA = tracer.startSpan("spanA");
const ctxHoldingSpanA = trace.setSpan(ROOT_CONTEXT, spanA);
const spanB = tracer.startSpan("spanB", {}, ctxHoldingSpanA);
This is getting cumbersome/difficutl in a larger codebase as it requires to pass around the "current"/"active" span (or the context referring to it) manually. In special if you use auto instrumentations this no longer works that direct - a ContextManager is need.
The ContextManager takes care of tracking the active context across async boundaries. For Node.js it's based on AsyncLocalStorage.
It's up to the user to install a context manager, either direct via api.context.setGlobalContextManager() or by using e.g. NodeSDKor NodeTracerProvider which does this for you (BasicTracerProvicer doesn't install one on default).
With a context manager in place also code like this works:
import { trace, context, ROOT_CONTEXT } from "@opentelemetry/api";
const spanA = tracer.startSpan("spanA");
const ctxHoldingSpanA = trace.setSpan(ROOT_CONTEXT, spanA);
context.with(ctxHoldingSpanA, () => { // this sets the ctxHoldingSpanA as active for the callback
const spanB = tracer.startSpan("spanB"); // this picks up the context by calling `context.active()`
process.nextTick(() => {
const spanC = tracer.startSpan("spanC"); // even this works as ContextManager propagates the context across the async nextTick.
});
});
const spanD = tracer.startSpan("spanD"); // this span will be no child of spanA as context is not active here.
You may ask why using such a complex construct like context just to specify a parent span. Reason is that context can be used for more than just a span. You can set any sort of data and context manager takes care of propagation. One use case is baggage but you might use it for other stuff also.
You can't set default attributes as spans are intended to describe a single operation.
The great gaping problem with that is that I see no evidence of a span being logged out until it finishes, which means that a nested span is logged with absolutely no context for it. And any bug in the termination of the original span (eg, circuit breakers) means that data may never arrive. I thought the whole idea of spans was that operations are composed of other operations and we are trying to track why, for instance, we are seeing 10x as much traffic as we expect for a service and it turns out that something dumb is happening in the comment save logic, causing read amplification when {your session expired, you're an admin, you followed a link from Google}. This reads like, "well, you can do that if you want, but that's not what it's for."
Thanks for the example, that will probably help a bit, but it does indicate that the API as designed leaves a great deal of bookkeeping to the user, particularly for brownfield projects where any existing telemetry data was not modeled as callbacks/closures/Communicating Sequential Processes. There's gonna be a lot of victim blaming going on when I introduce this. If. I introduce this.
I have some code that does start/finally/stop (I have since added an alternative that takes a closure, but that's a lot of code to rewire). Right now I don't see any way to shoehorn into this callback based approach. And certainly not efficiently.
https://opentelemetry.io/docs/instrumentation/js/instrumentation/#get-the-current-span in the page you linked refers to a function that does not exist.
The ContextManager takes care of tracking the active context across async boundaries. For Node.js it's based on AsyncLocalStorage.
It's a bit difficult to step through this code in Jetbrains and figure things out. I'm only seeing the NOOP implementation, which probably explains some of the behavior I'm getting. The async_hooks implementation is a separate package. So for Node.js it's based on nothing, unless you include an extra package that handles it.
The great gaping problem with that is that I see no evidence of a span being logged out until it finishes, which means that a nested span is logged with absolutely no context for it.
That's true for BatchSpanProcessor and SimpleSpanProcessor part of @opentelemetry/sdk-trace-base. The call the exporter only for finished spans. SimpleSpanProcessor immediately for each finished span, BatchSpanProcessor either after some timeout or if a configurable amount of finished spans is present.
SimpleSpanProcessor is clearly a bad choice for production as doing a HTTP call for each span results in ridiculous overhead. But it servers a good job during dev for exporting on console.
As far as I know the OTLP protocol is only able to transport finished spans and this fits well to backends like zipkin/jaeger. Don't know if these backends woule even support to show non finished spans.
But if you use a different backend to few your traces you can implement your own SpanProcessor which exports also in progess spans and clearly you need an exporter + backend which uses a protocol that is able to transport this info and display it accordingly.
If it is just about logging to console it should be not that hard to use SimpleSpanProcessor and ConsoleSpanExporter as base and put something into onStart.
It's a bit difficult to step through this code in Jetbrains and figure things out. I'm only seeing the NOOP implementation, which probably explains some of the behavior I'm getting
Your initial post doesn't show how you setup SDK so can't give an exact answer here. But as you depend only on @opentelemetry/sdk-trace-base which is common code for browser and node it's clear that it doesn't hold any node.js specifics.
The simplest step is to move to @opentelemetry/sdk-trace-node which extends sdk-trace-base by node.js specifics - like installing an context manager matching to the node.js version you use.
https://opentelemetry.io/docs/instrumentation/js/instrumentation/#get-the-current-span in the page you linked refers to a function that does not exist.
it was added in API 1.2.0 but it seems you are still on 1.1.0. It's actually just a convenience function for trace.getSpan(context.active()).
Removing the bug label since this seems to be a misunderstanding of the API more than a bug in the implementation
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.
This issue was closed because it has been stale for 14 days with no activity.