opentelemetry-ruby
opentelemetry-ruby copied to clipboard
Children of client spans
I've noticed that our spans now look like this:
data:image/s3,"s3://crabby-images/0e4f8/0e4f897ab8f45aaf9aa571db8d81e0066d8247ee" alt="Screen Shot 2021-06-24 at 9 20 06 AM"
Where the greenish boxes on the left are multiple calls to memcache that are children of a CLIENT span. The brown box is the server span of a different service, the greenish boxes on the right more caching related stuff.
We are using Faraday so what I imagine it happens is that the client span opens at the first middleware in Faraday so subsequent things that happen there specially the middleware dealing with caching would create spans that are children of this initial faraday span. And this is kind of neat, it helps understand exactly why those calls to cache were done.
The issue is that we do not know when exactly we called the other service, as you see in the screenshot, there is quite a lag between the last cache call and the other service responding, is that network or is that some crazy heavy faraday middleware, we can't tell.
So ideally there would be another span at the very last middleware (the faraday adaptor?) before calling out. But that would make a client span to be child of a client span which I do not even know if it is legal.
The issue is that we do not know when exactly we called the other service, as you see in the screenshot, there is quite a lag between the last cache call and the other service responding, is that network or is that some crazy heavy faraday middleware, we can't tell.
An interesting point. I think the difficulty here is that this approach would make me want a span for every faraday middleware, but that's probably not useful at all in most cases. This is something closer to what you'd want to use a profiler for, I think?
(Unless, of course, the thought is "the OTel Faraday middleware is the problem" 😄 )
But that would make a client span to be child of a client span which I do not even know if it is legal.
My understanding of the spec and related discussions at this point is that "it's not forbidden, but perhaps not the ideal thing." 😄
With spans for the initial and final middlewares in the Faraday pipeline, that's enough to see if there is an issue there. As you said, if there is an issue there probably you'd rather run some heavy tooling locally to figure things out.
Another idea here would be that the first Faraday span is not "CLIENT" but "INTERNAL" and only the span doing the outgoing call is of type "CLIENT" I think that would be closer to the spirit of the spec.
I will defer to others on this - I think it gets at "what should the faraday instrumentation actually be covering?"
Another option that might be interesting is to have the Faraday instrumentation add events to the span when it begins processing, and when it makes the network request somehow.
But that would make a client span to be child of a client span which I do not even know if it is legal
That is now officially allowed, although I don't think any existing backend does anything useful with the nested CLIENT spans.
I'm no expert in Faraday, but I believe you can manually add the Faraday tracing middleware wherever you want in the middleware chain, and the instrumentation will detect that and not install itself twice, so if the default position doesn't work for you, maybe try placing it manually?
👋 This issue has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep
label to hold stale off permanently, or do nothing. If you do nothing this issue will be closed eventually by the stale bot.