tracing icon indicating copy to clipboard operation
tracing copied to clipboard

Parent/child correlation lost in trace when adding Baggage with reactor and otel-bridge

Open ygiros opened this issue 11 months ago • 5 comments

Hi 👋,

I'm currently testing Micrometer Tracing + OTEL bridge on a Spring Webflux Application and I notice something unexpected when I use context propagation.

To resume I have the following running example :

Image

  • User requests reactive-front app to compute some square value
  • reactive-front app is litteraly "just a front", a delegates compute task to reactive-delegate app (2 apps to test context propagation with HTTP)

I use Jaeger Tracing Platform to inspect generated traces.

When I use Observation API without any context propagation, spans in generated trace have correct parent <-> child correlation

Image

But then I follow documentation to configure context propagation

// observationRegistry auto-configured from ObservationAutoConfiguration
// otelTracer auto-configured from OpenTelemetryTracingAutoConfiguration

this.observationRegistry.observationConfig().observationHandler(new DefaultTracingObservationHandler(otelTracer));

ObservationAwareSpanThreadLocalAccessor observationAwareSpanThreadLocalAccessor = new ObservationAwareSpanThreadLocalAccessor(observationRegistry, otelTracer);
ObservationAwareBaggageThreadLocalAccessor observationAwareBaggageThreadLocalAccessor = new ObservationAwareBaggageThreadLocalAccessor(observationRegistry, otelTracer);
ContextRegistry.getInstance()
        .loadThreadLocalAccessors()
        .registerThreadLocalAccessor(observationAwareSpanThreadLocalAccessor)
        .registerThreadLocalAccessor(observationAwareBaggageThreadLocalAccessor);
Hooks.enableAutomaticContextPropagation();

And I append some Baggage in my reactor sequence

return Mono.fromSupplier(() -> value)
    .doOnNext(aDouble -> {
        log.info("Receive request to calculate square of {}", aDouble);
        log.info("Current Baggage = {}", otelTracer.getAllBaggage());
    })
    .flatMap(this::computeSquare)
    .doOnNext(squareValue -> log.info("Respond result = {} to client", squareValue))
    .map(ResponseEntity::ok)
    // Name sequence (= name generated span)
    .name("getSquare-method")
    // Bind key/value pair to sequence (= set attribute to current span)
    .tag("value.from.request", value.toString())
    // Declare observation on sequence (= generate span)
    .tap(Micrometer.observation(observationRegistry))
    // Appends Baggage - appends here because of https://github.com/micrometer-metrics/tracing/issues/561
    // Didn't find the explanation of why it must be declared at the end ?
    .contextWrite(ReactorBaggage.append("baggage.value.from.request", String.valueOf(value)));

And now generated trace is "flat", meaning all my spans are root without parent

Image

I don't really understand why I have this result. Am I doing something wrong ?

PS : I've tried to reproduce it with unit tests, but I have some difficulties to use micrometer-tracing-test for now 😶


Technical Stack

  • Spring Boot 3.4.2
  • Micrometer Tracing 1.4.2

=> Full example project is available here : https://github.com/ygiros/micrometer-tracing-reactor

ygiros avatar Feb 10 '25 08:02 ygiros

@chemicL I wonder if .tap(Micrometer.observation(observationRegistry)) isn't a problem here. Any guesses?

marcingrzejszczak avatar Feb 10 '25 12:02 marcingrzejszczak

@marcingrzejszczak

But then I follow documentation to configure context propagation

Is this necessary on a Spring Boot application? Shouldn't this tutorial be meant for manual configuration? I imagine Spring Boot does its job to configure what's needed. I'd say first let's understand the setup and what is non-standard aside from the tap operator use:

// Declare observation on sequence (= generate span)
.tap(Micrometer.observation(observationRegistry))

I'm not sure if this is necessary since you already are adding an observation to the chain and the framework takes care of creating child spans in downstream services.

This is also something that might be worth looking at before wondering about reactor:

// Appends Baggage - appends here because of https://github.com/micrometer-metrics/tracing/issues/561
// Didn't find the explanation of why it must be declared at the end ?
.contextWrite(ReactorBaggage.append("baggage.value.from.request", String.valueOf(value)));

@marcingrzejszczak can you comment on this?

chemicL avatar Feb 10 '25 12:02 chemicL

Is this necessary on a Spring Boot application? Shouldn't this tutorial be meant for manual configuration? I imagine Spring Boot does its job to configure what's needed. I'd say first let's understand the setup and what is non-standard aside from the tap operator use:

Yeah Boot should ensure that we have a parent observation from the HTTP exchange filter function AFAIR.

I'm not sure if this is necessary since you already are adding an observation to the chain and the framework takes care of creating child spans in downstream services.

Yeah I don't understand what this is about.

@marcingrzejszczak can you comment on this?

Wasn't it the case that we wanted to have baggage appended at the very beginning of the flux? That's why this is done at the end?

marcingrzejszczak avatar Feb 10 '25 12:02 marcingrzejszczak

I don't understand the details here, but I think this comment and the associated commit might be helpful. It feels that the case presented here is overriding some defaults and leads to issues.

chemicL avatar Feb 10 '25 13:02 chemicL

Hi,

Thanks for your insights.

Is this necessary on a Spring Boot application? Shouldn't this tutorial be meant for manual configuration?

Yes sorry I didn't explain why I'm using a Spring Boot application in the first place. I'm looking to be as close as possible as an actual industrial application I'm currently working on, and it's Spring Boot based. We migrated from Spring Boot 2 to 3 last year and now we are looking to migrate from OpenTracing to OpenTelemetry. We are currently using the Datadog tracing library and we want to explore neutral-vendor solutions in our migration to OpenTelemetry.

I'm not sure if this is necessary since you already are adding an observation to the chain and the framework takes care of creating child spans in downstream services.

I want to combine zero-code instrumentation with manual instrumentation because it's our current use case. This is why I explicitly declare additional observations here. But yes it's not really useful to this specific example.

I think this comment and the associated commit might be helpful. It feels that the case presented here is overriding some defaults and leads to issues.

I've looked into it and I'm not completely sure how it might help ? I understand that I should disabled Spring Boot auto-configuration and make my own ?

ygiros avatar Feb 11 '25 10:02 ygiros

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

github-actions[bot] avatar Feb 19 '25 02:02 github-actions[bot]

Hi,

I didn't get an answer for my previous question :

I understand that I should disabled Spring Boot auto-configuration and make my own ?

ygiros avatar Feb 19 '25 08:02 ygiros

I understand that I should disabled Spring Boot auto-configuration and make my own ?

@marcingrzejszczak can you provide some guidance (maybe the docs can be improved to give suggestions about the Boot lifecycle, too)? I would say rather the opposite - aim to use the defaults, stack the libraries and rely on their integration rather than circumventing them.

I am wondering whether @ygiros is not hitting the same problem as described in https://github.com/spring-projects/spring-boot/issues/34510 ?

Some comments I have:

  • https://github.com/ygiros/micrometer-tracing-reactor/commit/0818de4629ae269fdf8910d040031af3fc42c4ef#diff-574bf05d7fff29ee56e96c1cdb0965fb1701bd4106c4438152e7bcf1dfd747d0R47 - you should either register a bean for this or have an ObservationRegistryCustomizer. Have a look at Spring Boot docs:

Beans of type ObservationPredicate, GlobalObservationConvention, ObservationFilter and ObservationHandler will be automatically registered on the ObservationRegistry. You can additionally register any number of ObservationRegistryCustomizer beans to further configure the registry.

  • https://github.com/ygiros/micrometer-tracing-reactor/commit/0818de4629ae269fdf8910d040031af3fc42c4ef#diff-574bf05d7fff29ee56e96c1cdb0965fb1701bd4106c4438152e7bcf1dfd747d0R52 - you should definitely not call this, this is done by Spring Boot for you so effectively you're adding the standard handlers twice AFAIU
  • https://github.com/ygiros/micrometer-tracing-reactor/commit/0818de4629ae269fdf8910d040031af3fc42c4ef#diff-574bf05d7fff29ee56e96c1cdb0965fb1701bd4106c4438152e7bcf1dfd747d0R55 - you already enabled it in https://github.com/ygiros/micrometer-tracing-reactor/commit/0818de4629ae269fdf8910d040031af3fc42c4ef#diff-9c206c3592f04a7fe75dc6b200dc1f84eaad899812af7d2caaf4cd7b5a14bd9bR3

chemicL avatar Feb 19 '25 08:02 chemicL

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

github-actions[bot] avatar Feb 27 '25 02:02 github-actions[bot]

@chemicL thanks for your comments, I'll look into it.

ygiros avatar Mar 03 '25 08:03 ygiros

@chemicL I applied corrections with your previous comments on a dedicated branch : https://github.com/ygiros/micrometer-tracing-reactor/pull/2

With informations described in https://github.com/spring-projects/spring-boot/issues/34510 I found the ReactorNettyTracingObservationHandler implementation and I used it instead of DefaultTracingObservationHandler ➡️ this modification fixes my trace ; now I have context propagation and correct parent ↔️ child correlations with spans 👍

However I still have an issue on how to add key/value in Baggage in some cases...

  1. I can use .contextWrite(ReactorBaggage.append("key", "value")) in my reactor sequence, but it's working only if I use it at the end (https://github.com/micrometer-metrics/tracing/issues/561) ; I'm still confused on why it's not working if we use it elsewhere in the sequence ?
  2. Using .contextWrite(ReactorBaggage.append("key", "value")) implies that "value" is hardcoded or from a variable declared before the sequence ; How can I add key/value in Baggage from the result of a previous reactor operator ?

Example

Mono.fromSupplier(() -> 2.0)
	// Computing square of 2
	.map(value -> value * value)
	.doOnNext(squareValue -> 
		// How to add "squareValue:4.0" to Baggage here ?
	)
	//...

ygiros avatar Mar 07 '25 08:03 ygiros

Glad it worked. The questions that you ask are rather fundamental and have to do with two concepts.

  1. Assembly time - the context is attached at reactive chain's assembly time and it seems you'd like the contents be filled at runtime which happens at a different phase. You can do something like this to combine assembly with runtime - you can assemble an inner chain at runtime of the outer chain:
Mono.fromSupplier(() -> 2.0)
	// Computing square of 2
	.map(value -> value * value)
	.flatMap(squareValue -> 
		// Add "squareValue:4.0" to Baggage here:
		Mono.just(squareValue)
				// do some processing in the context of this baggage, e.g. call WebClient
				.contextWrite(ReactorBaggage.append("key", squareValue))
	)
	//...
  1. Onion-like wrapping of operators. It's purely a functional programming limitation. Imagine you do something like f(g(x)) and attach context to function g. There is no technical way for f to look into it. Therefore if your computation needs to have some context it needs to be applied at the level of f. This translates to the declarative operators chaining with reactor, imagine f and g being operators and x being the source: x.g().f().
  • x.g().withContext(C).f(): g can look into C, f can not.
  • x.g().f().withContext(C) both g and f can look into C.

chemicL avatar Mar 07 '25 13:03 chemicL

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

github-actions[bot] avatar Mar 15 '25 02:03 github-actions[bot]

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open.

github-actions[bot] avatar Mar 22 '25 02:03 github-actions[bot]

Please let me know if you want us to reopen it.

jonatan-ivanov avatar Mar 25 '25 04:03 jonatan-ivanov

Thank you for all the answers, I think I have what I need for now so no need to reopen the issue 🙂

ygiros avatar Mar 25 '25 08:03 ygiros