aws-otel-collector
Support the tail sampling processor from opentelemetry collector
When trying to use the tail_sampling processor, I found it was not included as one of the processors defined in AWS ADOT
Describe the solution you'd like
Add the tailsamplingprocessor package and add it to the processor map.
Describe alternatives you've considered
I tried creating a PR, but it was closed. I have used the branch from that PR to self-host a forked Docker image.
For my needs, I need the collector to drop spans such as those from health check endpoints, which can be accomplished with the tail sampling processor. There are other uses as well.
Additional context
My pull request
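For readers with the same health check use case, here is a minimal sketch of how such a policy might look once the processor is available. The policy name, the `http.target` attribute key, and the `/health` path are assumptions for illustration; your instrumentation may record the path under a different attribute. The `string_attribute` policy with `invert_match` is meant to exclude matching traces, but its exact behavior has varied between collector versions, so verify it against the version you run.

```yaml
processors:
  tail_sampling:
    decision_wait: 10s
    policies:
      # Hypothetical policy name; not taken from the original issue.
      - name: drop_health_checks
        type: string_attribute
        string_attribute:
          key: http.target        # assumed attribute key
          values: ["/health"]     # assumed health check path
          invert_match: true      # sample traces that do NOT match
```

Note that the decision applies to whole traces, not to individual spans, so this only helps when health check requests form their own traces.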
We would like to use the tail_sampling processor too. In our use case, we would like to limit the number of successful spans to a specific rate.
processors:
  tail_sampling:
    decision_wait: 30s
    policies:
      [
        {
          name: sample_few_successful_spans,
          type: and,
          and: {
            and_sub_policy:
              [
                {
                  name: sample_successful_spans,
                  type: status_code,
                  status_code: { status_codes: [ OK ] }
                },
                {
                  name: sample_few_spans,
                  type: rate_limiting,
                  rate_limiting: { spans_per_second: 15 }
                }
              ]
          }
        }
      ]
We're looking to add ADOT to our observability efforts but tail sampling is a must have. +1 from us
Thanks for your feedback @zksward! ADOT PM here. Can you elaborate why it's a "must have" please?
It allows us to estimate and control costs while having the flexibility to specify traces of interest. We have a high throughput distributed application and one of the main concerns and pushback on implementing a system wide observability solution is managing the volume of the data. Are there alternatives to the behavior of which I'm not aware?
Thanks, again. It's not on our immediate roadmap but things can always change.
Are there alternatives to the behavior of which I'm not aware?
You can use what the respective SDK offers, in your app.
Thanks for being open to the feedback! I'm excited for the possibilities ADOT brings.
You can use what the respective SDK offers, in your app.
That is really hard to manage in the app once you have to keep the sampling configuration in sync across many services pushing traces to the collector. If the sampling is done in the collector, it can be guaranteed.
Any update on whether this is planned for the next release in August/September?
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
I am also looking for tail sampling to be part of ADOT.
I would like to have a strategy that: (1) captures all traces that failed or had high latency and sends them to the backend, and (2) captures 5% of all traces regardless of their completion status.
I need the failed traces for RCA.
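The strategy described above could be sketched roughly as follows with the upstream tail sampling processor. This is an illustration under stated assumptions, not a tested ADOT configuration: the policy names are hypothetical, and the 5000 ms threshold is a placeholder because the comment does not say what counts as high latency. A trace is kept if any one of the policies decides to sample it.

```yaml
processors:
  tail_sampling:
    decision_wait: 30s
    policies:
      # (1a) Keep every trace that contains a span with an error status.
      - name: keep_failed_traces      # hypothetical name
        type: status_code
        status_code:
          status_codes: [ERROR]
      # (1b) Keep every trace slower than the threshold.
      - name: keep_slow_traces        # hypothetical name
        type: latency
        latency:
          threshold_ms: 5000          # assumed value for "high latency"
      # (2) Keep 5% of all traces regardless of outcome.
      - name: sample_5_percent        # hypothetical name
        type: probabilistic
        probabilistic:
          sampling_percentage: 5
```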
You can use what the respective SDK offers, in your app.
Please correct me if I'm wrong, but the SDKs only have the ability to sample a) at the span level and b) at the start of a span, i.e. by ratios or attributes given when the span is constructed. They cannot be used to filter whole traces, nor to filter based on attributes known only at the completion of a request/trace, such as HTTP status or total duration. That is what the tailsamplingprocessor does for you, and why it must be in the collector.
Support for the Tail Sampling processor was made available today through ADOT collector v0.29.0.
For more details please refer to: https://aws-otel.github.io/docs/ReleaseBlogs/aws-distro-for-opentelemetry-collector-v0.29.0
Closing the ticket as the component is ready for use.