OpenTelemetry TraceIdRatioBased sampler requirements following OTEP 235
Fixes #1413.
Changes
Updates Trace SDK and TraceState handling specifications with OTEP 235 sampling thresholds. This PR depends on https://github.com/open-telemetry/opentelemetry-specification/pull/4162 to introduce the concept of Trace Randomness. This PR is the second of two parts and focuses on thresholds.
- Revise the `TraceIdRatioBased` algorithm section. The existing TODO implies this is not a breaking change.
- Change text about `TraceIdRatioBased` construction.
- Move text about `TraceIdRatioBased` description (leave unmodified).
The content of OTEP 235 was revised for clarity by @kalyanaj in https://github.com/open-telemetry/oteps/pull/261. I've heavily copied from the final text in that still-unmerged OTEP. I introduced new content explaining how to compute thresholds from probabilities with use of variable precision, referring to the OTel Collector-Contrib pkg/sampling reference implementation. The new (Golang) demonstration code is validated here, https://go.dev/play/p/7eLM6FkuoA5.
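As a rough illustration of the threshold computation described above, here is a minimal sketch. It is not the pkg/sampling reference implementation and not the playground code linked above. It assumes the OTEP 235 conventions that a rejection threshold is T = (1 - p) * 2^56, that a span is sampled when its 56-bit randomness value is at least T, and that the `th` value is T written as 14 hex digits with trailing zeros removed. All function names are hypothetical, and the fixed-digit rounding is a simplification of variable precision that relies on `float64` arithmetic.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// maxThreshold is 2^56: the count of distinct 56-bit randomness values.
const maxThreshold = uint64(1) << 56

// probabilityToThreshold (hypothetical helper) maps a sampling probability
// to a rejection threshold T = (1 - p) * 2^56.
func probabilityToThreshold(p float64) (uint64, error) {
	if math.IsNaN(p) || p <= 0 || p > 1 {
		return 0, fmt.Errorf("probability %v is outside (0, 1]", p)
	}
	t := uint64(math.Round((1 - p) * float64(maxThreshold)))
	if t >= maxThreshold {
		// float64 cannot distinguish 1-p from 1 for very small p.
		return 0, fmt.Errorf("probability %v is too small for 56 bits", p)
	}
	return t, nil
}

// roundThreshold (hypothetical helper) keeps only the leading `digits` hex
// digits of a 14-digit threshold, rounding to nearest. This is a simplified
// stand-in for variable precision, not the reference algorithm.
func roundThreshold(t uint64, digits int) uint64 {
	if digits >= 14 {
		return t
	}
	if digits < 1 {
		digits = 1
	}
	shift := uint(4 * (14 - digits))
	r := (t + uint64(1)<<(shift-1)) >> shift << shift
	if r >= maxThreshold {
		// Rounding up overflowed; fall back to the largest value
		// representable at this precision.
		r = maxThreshold - uint64(1)<<shift
	}
	return r
}

// encodeThreshold formats a threshold as hex with trailing zeros removed.
func encodeThreshold(t uint64) string {
	if t == 0 {
		return "0"
	}
	return strings.TrimRight(fmt.Sprintf("%014x", t), "0")
}

// randomnessFromTraceID reads the least-significant 56 bits (last 7 bytes)
// of a 16-byte trace ID.
func randomnessFromTraceID(id [16]byte) uint64 {
	var r uint64
	for _, b := range id[9:] {
		r = r<<8 | uint64(b)
	}
	return r
}

// shouldSample applies the rule: keep the span when randomness >= threshold.
func shouldSample(randomness, threshold uint64) bool {
	return randomness >= threshold
}

func main() {
	for _, p := range []float64{1, 0.5, 0.25, 1.0 / 3} {
		t, err := probabilityToThreshold(p)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("p=%v th=%s th(4 digits)=%s\n",
			p, encodeThreshold(t), encodeThreshold(roundThreshold(t, 4)))
	}
}
```

Under these assumptions, a probability of 1 encodes as `0`, 0.5 as `8`, and 0.25 as `c`. A probability of 1/3 has no short exact encoding, which is where limiting the number of hex digits becomes useful.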
A proof of concept for this specification along with #4162 can be found in https://github.com/open-telemetry/opentelemetry-go/pull/5645.
Part of #3602.
Product of the Sampling SIG members @kentquirk @kalyanaj @oertl @PeterF778 and myself.
- [x] Related issues also #3307, #2253, #2179, #2113, #1947, #1844
- [x] OTEP: https://github.com/open-telemetry/oteps/pull/235
- [x] Links to the prototypes: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/probabilisticsamplerprocessor, https://github.com/open-telemetry/opentelemetry-go/pull/5645
- [x] `CHANGELOG.md`
- [x] `spec-compliance-matrix.md`
FWIW, it seems that this bug has been fixed in otel-collector v.1.32.0 🤔
Note that this appears to be fixed in public.ecr.aws/aws-observability/aws-otel-collector:v0.42.0 but I wanted to provide instructions for reproducing/testing the bug anyway.
To reproduce this issue in either public.ecr.aws/aws-observability/aws-otel-collector:v0.41.2 or public.ecr.aws/aws-observability/aws-otel-collector:v0.41.1 you can do the following:
# create an otel-agent-config.yaml file
cat > otel-agent-config.yaml <<EOF
services:
aws-ot-collector:
image: public.ecr.aws/aws-observability/aws-otel-collector:v0.41.2
command: ["--config=/etc/otel-agent-config.yaml"]
volumes:
- ./otel-agent-config.yaml:/etc/otel-agent-config.yaml
ports:
- 4318:4318
EOF
# create a docker-compose.yaml file
cat > docker-compose.yaml <<EOF
services:
aws-ot-collector:
image: public.ecr.aws/aws-observability/aws-otel-collector:v0.41.2
command: ["--config=/etc/otel-agent-config.yaml"]
volumes:
- ./otel-agent-config.yaml:/etc/otel-agent-config.yaml
ports:
- 4318:4318
EOF
# and run docker
docker compose up
Once the OTEL container is running:
curl http://localhost:4318
Even though this request does not send an actual metric, it is enough to trigger the bug.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue is not stale. The problem still exists.
@pawelkaliniakit - I believe that this issue has been resolved by newer versions of the aws-otel-collector - if this is true you might consider marking this issue closed.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been marked as stale for 30 days with no activity.