opentelemetry-collector
Loss of traces while terminating an otel collector
Component(s)
receiver/otlp
What happened?
Describe the bug: While terminating a collector instance, we observed a loss of traces.
The OTLP receiver accepted 7604 spans, but the exporter sent only 7504; approximately 100-200 spans were lost during the otel-collector termination.
Steps to reproduce
- Deploy the otel collector (receiver: otlp/grpc, exporter: opensearch).
- Send spans via OTLP gRPC (a minimal sender sketch follows this list).
- Gracefully terminate the running OpenTelemetry Collector process (see the termination sketch after the configuration below).
- Observe otelcol_receiver_accepted_spans, otelcol_receiver_refused_spans, otelcol_exporter_sent_spans, otelcol_exporter_send_failed_spans.
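For step 2, a minimal span-sender sketch using the OpenTelemetry Go SDK; the endpoint, instrumentation-scope name, and span count are illustrative assumptions, and any OTLP gRPC client would do:

package main

import (
	"context"
	"log"
	"time"

	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Export to the collector's OTLP gRPC receiver (0.0.0.0:4319 in the config below).
	exp, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("localhost:4319"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatalf("failed to create OTLP exporter: %v", err)
	}

	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
	tracer := tp.Tracer("span-simulator")

	// Emit a fixed number of test spans; spanCount is arbitrary for this sketch.
	const spanCount = 12000
	for i := 0; i < spanCount; i++ {
		_, span := tracer.Start(ctx, "test-span")
		span.End()
	}

	// Flush the SDK so all spans reach the collector before the client exits.
	shutdownCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()
	if err := tp.Shutdown(shutdownCtx); err != nil {
		log.Fatalf("tracer provider shutdown: %v", err)
	}
}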
What did you expect to see? No loss of spans during termination: every span accepted by the collector should be exported.
What did you see instead? There is a mismatch between spans received and spans exported during collector termination; some spans are lost and never ingested into the backend (OpenSearch).
Collector version
0.128.0
Environment information
OS: SLES 15-SP6
Compiler: go 1.24.6
OpenTelemetry Collector configuration
exporters:
  opensearch:
    http:
      endpoint: http://opensearch:9200
      tls:
        insecure: true
    retry_on_failure:
      enabled: true
      max_elapsed_time: 0
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1000
      sizer: requests
      storage: file_storage/opensearch
      block_on_overflow: false
extensions:
  file_storage/opensearch:
    directory: /opt/collector/queue/opensearch
    create_directory: true
  health_check:
    endpoint: 0.0.0.0:13133
processors:
  batch: {}
  batch/opensearch:
    send_batch_size: 8192
    send_batch_max_size: 8192
    timeout: 200ms
  batch/otlp:
    send_batch_size: 8192
    send_batch_max_size: 8192
    timeout: 200ms
  memory_limiter:
    check_interval: 5s
    limit_percentage: 85
    spike_limit_percentage: 10
connectors:
  forward/traces:
  forward/logs:
receivers:
  otlp/grpc-insecure:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4319
service:
  extensions:
    - health_check
    - file_storage/opensearch
  pipelines:
    traces:
      exporters:
        - forward/traces
      processors:
        - memory_limiter
        - batch
      receivers:
        - otlp/grpc-insecure
    traces/opensearch:
      exporters:
        - opensearch
      processors:
        - memory_limiter
        - batch/opensearch
      receivers:
        - forward/traces
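Because the sending queue is backed by the file_storage extension, requests still queued at shutdown should be persisted under /opt/collector/queue/opensearch and resumed by the next instance (which matches the 4298 spans exported after restart in the log output below). For step 3, a rough sketch of how the graceful termination was driven; the collector binary and config paths are placeholders:

package main

import (
	"log"
	"os"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	// Paths are hypothetical; adjust to the local installation.
	cmd := exec.Command("/usr/local/bin/otelcol-contrib", "--config", "/opt/collector/config.yaml")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Start(); err != nil {
		log.Fatalf("start collector: %v", err)
	}

	// Let the collector run while the simulator sends spans.
	time.Sleep(60 * time.Second)

	// Graceful termination: SIGTERM triggers the collector's ordered shutdown
	// (receivers stop first, then processors, then exporters).
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		log.Fatalf("signal collector: %v", err)
	}
	if err := cmd.Wait(); err != nil {
		log.Printf("collector exited: %v", err)
	}
}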
Log output
Spans sent via simulator: ~12000
Receiver accepted: 7604
otelcol_receiver_accepted_spans{receiver="otlp/grpc-insecure",transport="grpc"} 7604
Receiver refused: 0
otelcol_receiver_refused_spans{receiver="otlp/grpc-insecure"} 0
Exporter sent: 7504
otelcol_exporter_sent_spans{exporter="opensearch"} 7504
Exporter failed: 0
otelcol_exporter_send_failed_spans{exporter="opensearch"} 0
After terminating the OpenTelemetry Collector, the new collector instance exported the following span count:
otelcol_exporter_sent_spans{exporter="opensearch"} 4298
Total spans expected in OpenSearch: 12002
Actually stored in the backend: 11802 = 7504 (old instance) + 4298 (new instance)
esRest GET /jaeger-span-2025-09-12/_count
{"count":11802,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0}}
Additional context
The number of spans exported does not match the number of spans accepted by the receiver, and this occurs with both the OTLP gRPC and OTLP HTTP receivers.
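For reference, the counters above can be compared by scraping the collector's internal Prometheus metrics. The sketch below assumes the default internal telemetry endpoint (localhost:8888/metrics), which the configuration above does not override:

package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strconv"
	"strings"
)

// sum adds up every sample whose metric name starts with prefix and whose
// label set contains substr (e.g. a specific receiver or exporter label).
func sum(lines []string, prefix, substr string) float64 {
	var total float64
	for _, line := range lines {
		if strings.HasPrefix(line, prefix) && strings.Contains(line, substr) {
			fields := strings.Fields(line)
			if v, err := strconv.ParseFloat(fields[len(fields)-1], 64); err == nil {
				total += v
			}
		}
	}
	return total
}

func main() {
	// Default internal telemetry endpoint; an assumption, since the config
	// above does not set service.telemetry.metrics.
	resp, err := http.Get("http://localhost:8888/metrics")
	if err != nil {
		log.Fatalf("scrape collector metrics: %v", err)
	}
	defer resp.Body.Close()

	var lines []string
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}

	accepted := sum(lines, "otelcol_receiver_accepted_spans", `receiver="otlp/grpc-insecure"`)
	sent := sum(lines, "otelcol_exporter_sent_spans", `exporter="opensearch"`)
	failed := sum(lines, "otelcol_exporter_send_failed_spans", `exporter="opensearch"`)
	fmt.Printf("accepted=%v sent=%v failed=%v unaccounted=%v\n",
		accepted, sent, failed, accepted-sent-failed)
}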