opentelemetry-specification icon indicating copy to clipboard operation
opentelemetry-specification copied to clipboard

Instrumentation library info and Resources are not available to Samplers

Open Oberon00 opened this issue 3 years ago • 25 comments

ShouldSample receives as input all the known information about the Span to be created, see https://github.com/open-telemetry/opentelemetry-specification/blob/v1.1.0/specification/trace/sdk.md#shouldsample, except the instrumentation library info and the Resources.

Suppressing unwanted spans depending on the instrumentation library name was one of the original motivations of introducing it in the first place, see https://github.com/open-telemetry/oteps/blob/main/text/0016-named-tracers.md#solution

[...] a Sampler could be implemented that discards Spans or Metrics from certain libraries

Adding a backwards-compatible overload of ShouldSample that accepts these additional arguments (or an object with these additional properties) and providing a default implementation that forwards to the currently specified implementation should be possible at least in C++, Java, Python.

Oberon00 avatar Mar 30 '21 13:03 Oberon00

Discussed during the Friday issue triage meeting, this is a nice-to-have but not high-priority issue.

@anuraaga could you help on this? @Oberon00 would you be willing to take it?

reyang avatar Apr 02 '21 15:04 reyang

Sorry I did not react on https://github.com/open-telemetry/opentelemetry-specification/pull/1658#issuecomment-847958368

@bogdandrutu (and @iNikem)

@Oberon00 let's split this into 2 PRs to allow progress:

  1. that adds the InstrumentationLibrary which we've already agreed to support sampling by instrumentation library, then when we find a very good use-case for Resource we can discuss, before that point as pointed in the issue an implementation can be done anyway.

  2. The second PR will be when we have good use-case for Resource and prove that the alternative does not work.

For the first point, I still don't know now if we want to support having a sampler attached to multiple Tracers/TracerProviders or if we want to go the route of the SamplerProviderFactory.

Also, since we probably want to keep the times we change the stable sampler interface at a minimum, I don't think we should split the PR here, but decide now whether we ever want resources to be available to the sampler, and if so, in what way.

There is also a very similar issue open for how to pass resources to exporters here (#508). In summary, it currently feels like we have too many open questions in that area to proceed before we solve them in a way that gives us a consistent result. I see this cluster of issues:

  • This issue here (#1588, #1658)
  • open-telemetry/opentelemetry-ruby#508 "Change Processor and Exporter interface to pass resource"
  • #1690 "Clarify that Sampler can be shared among TracerProvider"
  • Maybe #1330 "Make the Resource part of API"
  • and also in the future open-telemetry/opentelemetry-ruby-contrib#44 "Adding resource attributes post-creation (e.g. via auto-discovery)" which might (or might not) bring us a new data structure that should be available to samplers.

Oberon00 avatar Jun 02 '21 09:06 Oberon00

I agree that there are a lot of open questions about Resources. But I also agree with Bogdan that making InstrumentationLibrary available to Samplers is not controversial, is useful and should be done now.

iNikem avatar Jun 07 '21 09:06 iNikem

Here's a sampler that needs Resource, for now passing in as a constructor but "keep in sync with SDK instance" is an unfortunate aspect.

https://github.com/open-telemetry/opentelemetry-java/pull/3343/files#diff-84d20f297fb0d8a8618975d3143dbd00a890f9988c8a7c25db22ec6ab15645d5R48

anuraaga avatar Jun 24 '21 06:06 anuraaga

@Oberon00 I want to have InstrumentationLibrary available to Samplers by doing https://github.com/open-telemetry/opentelemetry-specification/pull/1658#issuecomment-847958368. Are you Ok with me picking the corresponding part of that PR of yours and make a new PR about InstrumentationLibrary only? Or do you want to do it yourself? I don't want to touch Resources right now :)

iNikem avatar Aug 04 '21 09:08 iNikem

@iNikem I have no problem with someone else doing the PR, so please go ahead 😃👍

I'm a bit worried about not including resources though. This will probably mean that we will never get around to it (and would need to make not-easily-compatible changes twice if we did).

Oberon00 avatar Aug 04 '21 09:08 Oberon00

Just wondering if there's anything innately complicated about Resource that isn't for InstrumentationLibrary? If there isn't really any difference in complexity, given that the update will require backwards compatibility shims / default methods, isn't there benefit to doing them both at the same time to reduce API churn?

For use case, the X-Ray sampler requires the resource to allow sampling rules based on service type or ARN. I currently use this tedious pattern which is already basically "deprecated on launch".

https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/aws-xray/src/main/java/io/opentelemetry/contrib/awsxray/AwsXrayRemoteSampler.java#L64

anuraaga avatar Aug 05 '21 07:08 anuraaga

Just wondering if there's anything innately complicated about Resource that isn't for InstrumentationLibrary?

No, but the last PR was declined because people wanted to have a more complicated solution which the PR author (me) was not really interested in, so it got nowhere, see the linked #1658.

Oberon00 avatar Aug 05 '21 07:08 Oberon00

I'd be happy to reopen #1658 as-is if we now have found a good use case, if @iNikem agrees? But probably you had reasons for just proposing Resources.

Oberon00 avatar Aug 05 '21 07:08 Oberon00

@Oberon00 So to judge from the answer "No", does it mean we can just take the approach in #1850 and copy-paste it for Resource if it is the same complexity? :)

anuraaga avatar Aug 05 '21 07:08 anuraaga

@anuraaga This was demonstrated in #1658 (which is really just #1850 + one more line for resources). It is technically possible but was declined.

Oberon00 avatar Aug 05 '21 08:08 Oberon00

I'd be happy to reopen #1658 as-is if we now have found a good use case, if @iNikem agrees? But probably you had reasons for just proposing Resources.

My only objection to reopening 1658 instead of 1850 is the fear that it will be dragged for a long time again. I see a clear benefit in handling InstrumentationLibrary and Resource separately.

iNikem avatar Aug 05 '21 09:08 iNikem

Now that #1850 is merged, I reopened #1658 and updated it to only cover Resources.

Oberon00 avatar Aug 10 '21 14:08 Oberon00

During Maintainers meeting on 20-09-2021 @jmacd expressed his opinion that we should move to configuration-time decisions of configuring SDK and/or Tracers with the correct Samplers: all decisions based on Resources and InstrumentationLibraries can be done before providing Tracers to the users. Thus those Tracers can already use Samplers chosen/narrowed down based on that information. Thus they don't need to accept it in ShouldSample method.

Several attendees, myself including, agreed that this is a reasonable idea. It was thus decided to revert introducing InstrumentationLibrary to ShouldSample method and get back to the drawing board to choose between two approaches: either pass both Resources and InstrumentationLibrary to ShouldSample in one spec release OR don't pass neither of them and allow SDKs to configure specific Sampler for a specific Tracer.

iNikem avatar Sep 22 '21 08:09 iNikem

all decisions based on Resources and InstrumentationLibraries can be done before providing Tracers to the users.

Doesn't this disallow tweaking sampling rate based on resource information since the decisions would already have been completed? Cranking up the sampling rate based on resource without restarting a server is very important functionality when debugging specific services in a system and controlling sampling with a centralized config.

Though I understand that we "get back to the drawing board to choose between two approaches" so I'm guessing that means this isn't actually a done thing but just an idea so should be OK.

anuraaga avatar Sep 22 '21 08:09 anuraaga

But resources are immutable? How decisions can be changed based on immutable resources?

iNikem avatar Sep 22 '21 08:09 iNikem

If you have a sampling config such as

{
  service1: 0.01,
  service2: 0.05,
...
}

but service1 starts having a problem, you can usually redeploy the file and watch it to reconfigure the sampler without restarting your app

{
  service1: 0.50,
  service2: 0.05,
...
}

Resource indeed didn't change, but sampling rate did.

anuraaga avatar Sep 22 '21 09:09 anuraaga

@anuraaga that just requires sampler to support dynamic config (as jaeger samplers do), not to reinitialize sampler with a different resource.

yurishkuro avatar Sep 22 '21 14:09 yurishkuro

@yurishkuro Right the sampler doesn't need to be reinitialized - but it needs access to the Resource which is what this issue is titled.

all decisions based on Resources and InstrumentationLibraries can be done before providing Tracers to the users

The main point is that this isn't true for dynamic updates of sampling rate.

anuraaga avatar Sep 22 '21 14:09 anuraaga

It CAN have access to resource upon initialization, just not at ShouldSample() calls. Dynamic updates of sampling rate do not change anything in the Resource.

yurishkuro avatar Sep 22 '21 15:09 yurishkuro

Is there any progress on this issue? The span name argument is not quite useful if the instrumentation library cannot be determined.

sherwoodwang avatar Feb 17 '22 02:02 sherwoodwang

During Maintainers meeting on 20-09-2021 @jmacd expressed his opinion that we should move to configuration-time decisions of configuring SDK and/or Tracers with the correct Samplers: all decisions based on Resources and InstrumentationLibraries can be done before providing Tracers to the users. Thus those Tracers can already use Samplers chosen/narrowed down based on that information. Thus they don't need to accept it in ShouldSample method.

Several attendees, myself including, agreed that this is a reasonable idea. It was thus decided to revert introducing InstrumentationLibrary to ShouldSample method and get back to the drawing board to choose between two approaches: either pass both Resources and InstrumentationLibrary to ShouldSample in one spec release OR don't pass neither of them and allow SDKs to configure specific Sampler for a specific Tracer.

I like the idea that Samplers should be configured on a per-Tracer basis. This won’t require changes in OpenTelemetry API. It’s generally better to keep the interface of Sampler minimal. It’s also more flexible to provide information to user code as early as this piece of information becomes available than delaying it to the time to creating Spans.

sherwoodwang avatar Feb 17 '22 02:02 sherwoodwang

Looks like the API needs to be extended to either configure a new sampler on an existing Tracer or to pass the sampler to use to TracerProvider.getTracer().

Flarna avatar Feb 17 '22 08:02 Flarna

I need this feature as well in order to reduce the verbosity of noisy instrumentation libraries in a general and dynamic manner. At first, I was planning to add config options to various instrumentations which control the amount of spans being started, but after some discussion, I want to try to POC the sampler option. See discussion in ruby SDK for reference How can we move forward with this issue?

blumamir avatar Mar 08 '22 12:03 blumamir

Looks like the API needs to be extended to either configure a new sampler on an existing Tracer or to pass the sampler to use to TracerProvider.getTracer().

No, TracerProvider.get() has already taken arguments that indicate the instrumentation library. Changes solely in SDK will be sufficient to pass this information to sampler providers.

sherwoodwang avatar Apr 07 '22 13:04 sherwoodwang

allow SDKs to configure specific Sampler for a specific Tracer

Hey, guys. I don't quite understand what's being said here😳Is there a way to exclude span data now if we don't use Sampler#shouldSample?

Is there any progress on this issue? The span name argument is not quite useful if the instrumentation library cannot be determined.

Agreed, the span name may be duplicated between different instrumention library. For example, we barely pay attention the redis's QUERY operations, but the mysql's QUERY operations are very important. If I exclude the QUERY span name without InstrumentionInfo, all of them are excluded.This is unacceptable.

IcebergXTY avatar Jan 30 '23 02:01 IcebergXTY

My old OTEP was about supporting per-Tracer samplers. So those interested might want to look at it for a starting point: https://github.com/open-telemetry/oteps/pull/186

I've flip flopped on the usefulness of this approach so haven't tried carrying it forward.

tsloughter avatar Jan 31 '23 22:01 tsloughter

My old OTEP was about supporting per-Tracer samplers. So those interested might want to look at it for a starting point: open-telemetry/oteps#186

I've flip flopped on the usefulness of this approach so haven't tried carrying it forward.

May I know the approach you eventually take to resolve this issue?

I'm happy to raise another OTEP for this. But I'd like to limit the scope to Sampler only. Using different TracerProvider for different instrumentation scopes can always be a solution to this category of problems. In fact, we can always move any dynamic value from the construction of Tracer to the construction of TracerProvider and vice versa by creating many TracerProviders. However, that would render the instrumentation scope argument provided to TracerProvider.get() completely redundant. What makes a Sampler different is a Sampler needs to understand the semantics of span names, and the semantics of span names is inherently linked to their instrumentation scope. Sampler is very different in the sense that it depends on span names. Thus I would argue that a Sampler ought to be picked when a Tracer is constructed, rather than when a TracerProvider is constructed. If I raise an OTEP, I would touch only Sampler.

sherwoodwang avatar Mar 05 '23 14:03 sherwoodwang

@sherwoodwang I think sampler should just take the instrumentation library as an argument.

tsloughter avatar Mar 07 '23 00:03 tsloughter

@sherwoodwang I think sampler should just take the instrumentation library as an argument.

For my project, I would be satisfied with both approaches, either adding the dispatching logic to tracer providers, or adding arguments to samplers. According to the discussions above, Resource and InstrumentationScope need to be added at the same time in the same way. I'll try to create a OTEP. I haven't participated in the specification in the past. But I'll try.

sherwoodwang avatar Mar 11 '23 08:03 sherwoodwang