opentelemetry-specification icon indicating copy to clipboard operation
opentelemetry-specification copied to clipboard

Add BeforeEnd to have a callback where the span is still writeable

Open gugahoa opened this issue 3 years ago • 52 comments

What are you trying to achieve? Set an attribute to all spans in a trace

What did you expect to see? Writeable spans on SpanProcessor's OnEnd(Span)

Additional context. What I tried to implement is a feature called trace field in Honeycomb (reference: AddFieldToTrace)

For that I first tried to use SpanProcessor, but soon found out that OnEnd(Span) is not writeable.

The solution I arrived was to write a custom exporter which was able to read from a thread-local storage (at the time of the implementation, Baggage had yet to be spec'ed out) to list all attributes related to a Trace and insert into the span before exporting.

From what I understood from the SpanProcessor specification, that's the place where I should have placed this logic, as it's a transformation that should happen before the span is sent to any exporter

gugahoa avatar Oct 13 '20 17:10 gugahoa

/cc @arminru

tsloughter avatar Oct 13 '20 21:10 tsloughter

Required for dynamic resource-like attributes desired in every span but computed at run-time because they're not constant, eg. instance uptime and recent CPU utilisation. Injecting them in the exporter gives results that become less accurate during the spikes we're most interested in.

garthk avatar Oct 13 '20 21:10 garthk

IIRC it was discussed to make spans writeable in OnEnd but this was left out because users will not always have full control over the order in which span processors are registered and in turn invoked, which could result in unreliable behavior. From your use case it sounds like you would not need any attributes that are added/overwritten in OnEnd visible to any other span processor, right? If you only need to have those changes reflected once passed to exporters, this shouldn't be an issue then.

@carlosalberto you seemed to remember some of that discussion as well in the meeting yesterday. Do you know the rationale for that decision?

arminru avatar Oct 14 '20 07:10 arminru

Why these attributes cannot be set in onStart, which already has writable span?

iNikem avatar Oct 14 '20 07:10 iNikem

Reading the spec again, I noticed that it explicitly says that spans are already marked as ended at the time they are passed to SpanProcessor.OnEnd:

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/sdk.md#onendspan

OnEnd is called after a span is ended (i.e., the end timestamp is already set).

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#end

End: Implementations SHOULD ignore all subsequent calls to End and any other Span methods, i.e. the Span becomes non-recording by being ended (there might be exceptions when Tracer is streaming events and has no mutable state associated with the Span).

We would also have to change either of those requirements if we wanted to make spans writeable in OnEnd and we would need to understand the consequences well.

arminru avatar Oct 14 '20 08:10 arminru

Exactly. The difficult to get right (but specified!) order (exporting may happen before the other OnEnd processor) is another problem, but the main problem is that spans are already ended, as @arminru said. This was already discussed in https://github.com/open-telemetry/opentelemetry-specification/pull/669#issuecomment-654841632.

There are two ways we could go forward here:

  • Use existing mechanisms in your implementation to modify the "SpanData", i.e. the readable representation of the span that the exporter gets, instead of the Span itself. EDIT: E.g. Java has https://github.com/open-telemetry/opentelemetry-java/blob/v0.9.1/sdk_extensions/tracing_incubator/src/main/java/io/opentelemetry/sdk/extensions/incubator/trace/data/SpanDataBuilder.java
  • We may introduce a new SpanProcessor hook BeforeEnd.

We definitely can't pass a writable span to OnEnd.

Oberon00 avatar Oct 14 '20 09:10 Oberon00

Why not define it as your Span Processor must assume it gets the Span in the form it is ended with and not expect to have any changes from previous processors visible?

tsloughter avatar Oct 14 '20 13:10 tsloughter

@iNikem my understanding is that it is because OnStart can't be used to implement the Honeycomb feature https://godoc.org/github.com/honeycombio/beeline-go#AddFieldToTrace -- it would only be able to add the attribute to spans started after this call, but it is supposed to be able to add them to all local spans in the trace, even if already started.

I'd be fine with a BeforeEnd span processor, but don't really like adding more hooks when as far as I can tell the only issue is to define that OnEnd processors can't depend on each other and that if you have multiple processors writing the same attribute you can't know which will actually be added last -- so a warning to use unique attribute keys.

tsloughter avatar Oct 14 '20 13:10 tsloughter

No, the main issue is that in OnEnd the span is already ended, and for an ended span, SetAttribute should be a no-op. Changing that seems liable to opening a can of worms.

Oberon00 avatar Oct 14 '20 13:10 Oberon00

Ah, ok. Yea, I wouldn't want to change the fact that ended means read-only but instead move the "ending" to after the processors. But this isn't possible since one of the processors is what takes the "ended" span and passes it to the exporter... I didn't like exporting being part of processors because of this.

And probably too late to argue that sending to the exporter should be separate and unrelated to processors, leaving OnEnd processors to act like "BeforeEnd" :(

tsloughter avatar Oct 14 '20 14:10 tsloughter

Maybe we would even need a special state where the span is semi-ended, i.e. it has IsEnded==true and has an end timestamp, but is still recording.

Oberon00 avatar Oct 14 '20 14:10 Oberon00

@iNikem my understanding is that it is because OnStart can't be used to implement the Honeycomb feature https://godoc.org/github.com/honeycombio/beeline-go#AddFieldToTrace -- it would only be able to add the attribute to spans started after this call, but it is supposed to be able to add them to all local spans in the trace, even if already started.

I don't understand. If you have a SpanProcessor registered, it will receive all spans created in the given process. And so it will be able to add attributes to all spans. You don't "do a call" manually. All spans are handed to you.

iNikem avatar Oct 14 '20 14:10 iNikem

@iNikem it only receives a write-able span for OnStart, meaning if spans are already started in the trace you will not be handed those spans at a point that you have the attributes and can write to the span.

tsloughter avatar Oct 14 '20 14:10 tsloughter

It seems I still don't understand your use-case: what are these attributes that are not available right away but become available only when some spans were already started?

And even more: how does onEnd differ from onStart in this regard? I fail to see how you hoped to achieve your goal with onEnd that you cannot do in onStart?

iNikem avatar Oct 14 '20 15:10 iNikem

It seems I still don't understand your use-case: what are these attributes that are not available right away but become available only when some spans were already started?

An HTTP user id parsed out of headers is the best example I can think of, but actual honeycomb users (@garthk @gugahoa ?) probably have more. The initial request span is started right away, before user code parses out headers they care about, a child span is started when the user's logic is called at which point they parse out the user-id and want to put it on all spans in that trace but the span created when the HTTP handler got the request has already run OnStart so the only time it could be added is OnEnd -- without manually going through the stack and adding the attribute.

tsloughter avatar Oct 14 '20 15:10 tsloughter

That's a good characterization, @tsloughter. This has been a "normal" way of using Honeycomb's Beeline library for a long time. It buffers the trace span tree in memory, and can distribute attributes (called "fields" there) up and down the span tree before sending the spans to the server. It stores a set of fields on the trace itself, in a field on the trace.Trace type called "traceLevelFields." You can see the transcription from that field down to each span just before sending each here.

seh avatar Oct 14 '20 15:10 seh

I am curios why do you need to still add events/attributes to the Span when you can do that directly on the SpanData that we will export? What is the advantage to change the Span vs modifying the data that will be exported directly?

bogdandrutu avatar Oct 14 '20 16:10 bogdandrutu

Agree with @bogdandrutu. I'm not sure which implemenation language you are using but in Java there is https://github.com/open-telemetry/opentelemetry-java/blob/v0.9.1/sdk_extensions/tracing_incubator/src/main/java/io/opentelemetry/sdk/extensions/incubator/trace/data/SpanDataBuilder.java

Unfortunately, #158 is not resolved so this is all a bit vague in the spec.

Oberon00 avatar Oct 14 '20 16:10 Oberon00

"that we will export"

Where is there a pluggable place to modify SpanData before export?

If you are saying OnEnd Processor can modify SpanData then that would be fine.

tsloughter avatar Oct 14 '20 16:10 tsloughter

If you are saying OnEnd Processor can modify SpanData then that would be fine.

Processor can certainly modify SpanData

iNikem avatar Oct 14 '20 16:10 iNikem

@tsloughter

Where is there a pluggable place to modify SpanData before export?

The exporter interface. You can make your own exporter that wraps an existing one and does the modification.

Oberon00 avatar Oct 14 '20 16:10 Oberon00

@iNikem

Processor can certainly modify SpanData

True, but in order for the "modification" (SpanData is immutable at least in Java, so it actually creates an updated copy) to be picked up by an exporter, that exporter must be invoked in the same OnEnd.

Oberon00 avatar Oct 14 '20 16:10 Oberon00

@bogdandrutu

I am curios why do you need to still add events/attributes to the Span when you can do that directly on the SpanData that we will export? What is the advantage to change the Span vs modifying the data that will be exported directly?

That would mean writing my own processor which will export to the configured exporters, but I wouldn't want to write my own processor for that and instead it would be preferable to rely on tried-and-true processors such as the BatchProcessor

@Oberon00

The exporter interface. You can make your own exporter that wraps an existing one and does the modification.

I'm not sure this translates well to Erlang, but I need to try it out first to see how it goes

gugahoa avatar Oct 14 '20 16:10 gugahoa

The exporter interface. You can make your own exporter that wraps an existing one and does the modification.

That is what we hoped not to have to do and I believe is already the way people have been doing it.

But I guess if its a wrapper that can take any exporter it and be a new exporter that acts on the attributes and then calls the wrapped exporter with the modified batch it wouldn't be too bad.

It would certainly be more efficient to update "on end", but it sounds like a wrapper exporter is the way to go -- assuming there aren't issues with the lifetime of the shared attributes on a trace, will have to talk to those using it.

tsloughter avatar Oct 14 '20 16:10 tsloughter

I'm not sure this translates well to Erlang, but I need to try it out first to see how it goes

It does. We have an exporter behaviour, otel_exporter. You can create a new exporter that simply goes through all the spans and instead of exporting them it updates their attributes in the table and then calls the export/2 function of the wrapped exporter.

And has to call the init/1 of the wrapped exporter when its init/1 is called at startup.

tsloughter avatar Oct 14 '20 16:10 tsloughter

It would certainly be more efficient to update "on end"

You seems to have an implicit assumption that for each span all its parent spans from the same trace will end after it. That assumption is wrong. Parent span can end even before child span starts.

iNikem avatar Oct 14 '20 17:10 iNikem

Parent span can end even before child span starts.

What does that represent? Perhaps an asynchronous task spawned by the parent that took a while to start?

seh avatar Oct 14 '20 17:10 seh

It would certainly be more efficient to update "on end"

You seems to have an implicit assumption that for each span all its parent spans from the same trace will end after it. That assumption is wrong. Parent span can end even before child span starts.

I know, this feature can't be perfect, there will be missed spans in some specific cases, but for many cases it appears to be useful and work just fine.

tsloughter avatar Oct 14 '20 17:10 tsloughter

Parent span can end even before child span starts.

What does that represent? Perhaps an asynchronous task spawned by the parent that took a while to start?

Async servlet/Spring webflux controller. Controller method will return and its span will finish before async result starts actual computing.

iNikem avatar Oct 14 '20 17:10 iNikem

If we're not up for changing the behaviour of OnEnd to BeforeEnd, but some users need something with BeforeEnd semantics, why not add a BeforeEnd?

The alternative seems to be to either lose our time correlation or build amusing workarounds like using OnEnd to create a child span with the data we couldn't add to this one, and wrapping the exporter we want in another one which first buffers spans for long enough to reliably find the disposable spans, copy attributes from them to their parent, and then discard them, and then calls the original exporter. It's possible, but it's a lot of work to force on our users.

garthk avatar Oct 14 '20 20:10 garthk