Spans Created from zio-telemetry and auto instrumentation are not connected

Open grogdotcom opened this issue 3 years ago • 1 comments

My company uses a mostly Java stack and relies on OpenTelemetry auto instrumentation for the majority of its services; however, I work on a ZIO service. We would also like to leverage auto instrumentation. We are able to get some connected spans through auto instrumentation, but we have noticed that when the root span is created with zio-telemetry and any downstream span is created with auto instrumentation, the spans are not connected.

I see #207 solves this issue when the auto instrumented spans come from non-ZIO code, but when the auto instrumented spans come from ZIO code this solution no longer works. As an example, let's say I have a non-ZIO database library that is wrapped in a ZIO library, so all my calls to the database are ZIOs. If I create a new span and call into this ZIO database wrapper library, the spans in Jaeger will be separate.

What is interesting, however, is that when my service receives a gRPC call that downstream calls a ZIO-wrapped database library, all of the auto instrumented spans are connected. In this case, though, zio-telemetry is never used.

My guess is that the active span set in the ZIO environment does not translate to the underlying Context that auto instrumentation reads.
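
To illustrate the guess above, here is a minimal sketch that uses only the Scala/Java standard library. It assumes the agent looks up the current context in thread-local storage (the default storage in OpenTelemetry Java). `currentSpan` and `readOnOtherThread` are hypothetical stand-ins, not part of the OpenTelemetry or zio-telemetry APIs. A value set on one thread is simply not visible from another thread, which is the situation a fiber is in if it carries its span in the ZIO environment and later runs elsewhere.

```scala
import java.util.concurrent.{Callable, Executors}

object ThreadLocalContextDemo {
  // Hypothetical stand-in for a thread-local "current span" slot.
  // This is NOT the OpenTelemetry API, only an illustration of the mechanism.
  val currentSpan: ThreadLocal[String] = new ThreadLocal[String] {
    override protected def initialValue(): String = "no-span"
  }

  // Reads the slot from a freshly created thread and returns what that thread sees.
  def readOnOtherThread(): String = {
    val pool = Executors.newSingleThreadExecutor()
    val task = new Callable[String] {
      def call(): String = currentSpan.get()
    }
    try pool.submit(task).get()
    finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    currentSpan.set("root-span")
    println(s"same thread:  ${currentSpan.get()}")    // prints "same thread:  root-span"
    println(s"other thread: ${readOnOtherThread()}")  // prints "other thread: no-span"
  }
}
```

If this is what happens here, a span that lives only in ZIO's own state would have to be copied into the thread-local context around each instrumented call for the agent to pick it up as the parent.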

grogdotcom avatar Aug 06 '22 11:08 grogdotcom

Hey @grogdotcom,

My PR adds experimental support for context propagation. It's merged and there is a snapshot version available. AFAIK a release is planned soon as well.

Could you let me know about your experience, if you decide to take it for a spin, please?

dmytr avatar Oct 10 '22 19:10 dmytr

The issue still persists. Here is an example repo highlighting it: https://github.com/jeejeeone/zio-telemetry-poc2. After adding the OpenTelemetry agent, traces defined in ZIO code are separate, and auto instrumented traces are also separate. Sadly, I don't have time to make the example more easily runnable, but you get the gist.

jeejeeone avatar Nov 18 '22 13:11 jeejeeone

Hi @jeejeeone,

I think the problem is that in this example a new Jaeger tracer is created with JaegerTracer.live:

    Server.serve(app).provide(Server.default, Tracing.propagating, JaegerTracer.live)

(Link: https://github.com/jeejeeone/zio-telemetry-poc2/blob/main/src/main/scala/com/example/Main.scala#L62)

I've tried using the tracer provided by the automatic instrumentation and all traces seem to be correctly connected. To do this:

    // Requires: import io.opentelemetry.api.GlobalOpenTelemetry
    Server.serve(app).provide(Server.default, Tracing.propagating, ZLayer.succeed(GlobalOpenTelemetry.getTracer("test")))

Also, this example app is a bit too simplistic. There is a very high chance the fiber will be executed on the same thread, so even if context propagation didn't work, the traces would still be connected. There should be something that suspends the fiber and, hopefully, moves it to another thread when it's resumed. Adding a sleep might cause this.
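
A rough sketch of the kind of check described above, assuming ZIO 2.x on the classpath. `ThreadHopDemo` and `observed` are hypothetical names used only for illustration. It records the JVM thread name before and after a `ZIO.sleep`, which suspends the fiber. Whether the fiber actually resumes on a different thread depends on the runtime and scheduler, so the two names may or may not differ on a given run.

```scala
import zio._

object ThreadHopDemo extends ZIOAppDefault {
  // Name of the JVM thread that is currently running this fiber.
  private val threadName: UIO[String] =
    ZIO.succeed(Thread.currentThread().getName)

  // Thread names observed before and after a suspension point.
  val observed: UIO[(String, String)] =
    for {
      before <- threadName
      _      <- ZIO.sleep(100.millis) // suspends the fiber; it may resume on another thread
      after  <- threadName
    } yield (before, after)

  def run =
    observed.flatMap { case (before, after) =>
      Console.printLine(s"before sleep: $before, after sleep: $after")
    }
}
```

Placing a suspension point like this between the zio-telemetry span and the auto instrumented call would make the example app a more meaningful test of context propagation.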

dmytr avatar Nov 18 '22 20:11 dmytr

Thanks for the feedback @dmytr, makes sense!

jeejeeone avatar Nov 19 '22 12:11 jeejeeone

@jeejeeone This one could be closed, right?

grouzen avatar May 28 '23 20:05 grouzen

@grouzen Yes 👍

jeejeeone avatar May 29 '23 03:05 jeejeeone