opentelemetry-collector-contrib

To achieve different sampling rates for different applications and integrate them with the OTel collectors

Open zendesk-shweta opened this issue 1 year ago • 7 comments

Component(s)

No response

Describe the issue you're reporting

How can we set up different sampling rates for different applications and integrate them with the OTel Collector, so that the sampling rates are controlled centrally in the collector configuration? What approaches are available on the OTel side?

zendesk-shweta, Mar 05 '24 02:03

Usually sampling rates for applications are determined by settings in the configured receivers. To choose different sampling rates for different applications, you'd want to check the configuration options for each receiver you're interested in using, and go from there.

Is that generally what you're wondering, or did I misunderstand?

crobert-1, Mar 05 '24 18:03

Let's say I have two services running on the same cluster as the OTel Collector, and each service sends its traces to the collector. Our requirement is to set a different sampling rate for each service in the collector config file. Can I define the sampling rates like this?

    extensions:
      pprof:
        endpoint: :1888
      zpages:
        endpoint: :55679

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318

    processors:
      batch:
      probabilistic_sampler/tracing:
        sampling_percentage: 5
        rules:
          - service_name: "RUN-APP-OTEL-SERVICE"
            sampling_percentage: 10
          - service_name: "service2"
            sampling_percentage: 20
      memory_limiter:
        # 75% of maximum memory up to 2G
        limit_mib: 256
        # 25% of limit up to 2G
        spike_limit_mib: 200
        check_interval: 5s

    exporters:
      logging:
        loglevel: debug
      debug:
        verbosity: detailed
      datadog:
        api:
          site: "datadoghq.com"
          key: ${env:DD_API_KEY}
        tls:
          insecure_skip_verify: true
        sending_queue:
          enabled: true
          queue_size: 200
          num_consumers: 100
        timeout: 1s
        retry_on_failure:
          enabled: false
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 5m

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch, probabilistic_sampler/tracing]
          exporters: [debug, datadog, debug] # Change to datadog
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, datadog, debug] # Change to datadog

      extensions: [pprof, zpages]

Or is there a better way to achieve this? We may have a large number of services sending traces to the OTel Collector, and I am wondering how we would add all of those services under processors.

zendesk-shweta, Mar 06 '24 02:03

My apologies @zendesk-shweta, for some reason I misinterpreted your question thinking you were asking about how often to scrape endpoints in a receiver, not the sampling rate in the probabilistic sampler 👍

I don't think it's possible in a single processor definition for this processor. You'd likely have to define entirely different receivers and processors, and then have a pipeline in the collector for each service. I suggest this solution as I don't think this processor filters based on attributes, so all data that it gets would be sampled at the same rate. To be able to sample two sets of data at a different rate, you'd need the data sets to be separately received and processed, to my understanding. The code owners would have a definitive answer though, I'm not very familiar with this component and I may be missing something here.
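The separate-pipeline approach described above could look roughly like the following. This is only a sketch under stated assumptions, not a confirmed recommendation from the code owners: the component instance names (`otlp/service1`, `probabilistic_sampler/service1`, and so on) and the second port `4319` are hypothetical, each service would have to be pointed at its own receiver endpoint, and the 10% and 20% rates are taken from the question.

```yaml
receivers:
  # One OTLP receiver per service; each service exports to its own port.
  otlp/service1:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/service2:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4319 # hypothetical second port

processors:
  # One sampler instance per service, each with its own rate.
  probabilistic_sampler/service1:
    sampling_percentage: 10
  probabilistic_sampler/service2:
    sampling_percentage: 20

exporters:
  debug:
    verbosity: basic

service:
  pipelines:
    # Each pipeline wires one receiver to its matching sampler.
    traces/service1:
      receivers: [otlp/service1]
      processors: [probabilistic_sampler/service1]
      exporters: [debug]
    traces/service2:
      receivers: [otlp/service2]
      processors: [probabilistic_sampler/service2]
      exporters: [debug]
```

This keeps all sampling rates in the collector config, but it needs one receiver, one processor, and one pipeline per service, so it may become hard to maintain with a large number of services.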

I'll mark this as an enhancement request.

crobert-1, Mar 06 '24 16:03

Pinging code owners for processor/probabilisticsampler: @jpkrohling. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot], Mar 06 '24 16:03

/label processor/sampler help-wanted

zendesk-shweta, Mar 07 '24 05:03

This is a reasonable request. See https://github.com/open-telemetry/oteps/pull/250. I expect this functionality will eventually emerge in the probabilisticsamplerprocessor, and that the OpAMP protocol will be used to distribute sampling configurations via an OTel sampling configuration, but there is a lot of work to do.

jmacd, Mar 21 '24 14:03

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • processor/probabilisticsampler: @jpkrohling @jmacd

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot], May 21 '24 03:05