[exporter/elasticsearch] Use permanent error if applicable
Component(s)
exporter/elasticsearch
Is your feature request related to a problem? Please describe.
There may be places where the es exporter should use a permanent error (`consumererror.NewPermanent()`) but currently does not.
Describe the solution you'd like
Update es exporter code to use permanent errors where applicable
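For reference, the wrapper in question lives in `go.opentelemetry.io/collector/consumer/consumererror`. A minimal sketch of its behavior (the error message below is made up for illustration):

```go
package main

import (
	"errors"
	"fmt"

	"go.opentelemetry.io/collector/consumer/consumererror"
)

func main() {
	err := errors.New(`unsupported mapping mode "invalid_mode"`)

	// A plain error carries no hint about whether retrying could help.
	fmt.Println(consumererror.IsPermanent(err)) // false

	// Wrapping with NewPermanent marks the error as not worth retrying.
	perm := consumererror.NewPermanent(err)
	fmt.Println(consumererror.IsPermanent(perm)) // true
	fmt.Println(perm)                            // Permanent error: unsupported mapping mode "invalid_mode"
}
```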
Describe alternatives you've considered
No response
Additional context
No response
Pinging code owners:
- exporter/elasticsearch: @JaredTan95 @carsonip @lahsivjar
See Adding Labels via Comments if you do not have permissions to add labels yourself.
/label good-first-issue
I would like to attempt to make this enhancement. @carsonip, please assign it to me when you get a chance.
I don't have the permission to assign issues, but feel free to work on it.
Hey @carsonip, I have identified a few places where I think we can return a permanent error to reject the request due to document mapping or routing errors:
- The getRequestMappingMode method returns two errors when an invalid mapping mode is provided via the client metadata. These errors are returned by the push data method for each signal (log example).
- The getScopeMappingMode method returns an error when an invalid mapping mode is provided via the scope attribute. This error also applies to each signal (log example).
- ~~Lastly, for metric data only, the push data method returns an error for the entire request when there is an error finding the index to route to~~
  - update: this error is only logged and not propagated to the client
The last two errors are interesting since they are at the scope level. If I understand the flow correctly, we can partially process the earlier records but then reject the entire request based on an invalid `elastic.mapping.mode` scope attribute. The collector retry logic is not clear to me (see below), but I think returning a permanent error should ensure the request is not retried.
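To make the proposal concrete, here is a rough sketch of where the wrapping would go for the first case. Only the getRequestMappingMode / getScopeMappingMode names come from the exporter; the surrounding types and signatures are simplified stand-ins, not the actual code:

```go
package main

import (
	"context"
	"errors"
	"fmt"

	"go.opentelemetry.io/collector/consumer/consumererror"
)

// exporterStub stands in for the elasticsearch exporter; only the helper
// name mirrors the method referenced above, everything else is made up.
type exporterStub struct{}

func (e *exporterStub) getRequestMappingMode(ctx context.Context) (string, error) {
	// Placeholder: the real method resolves the mode from client metadata.
	return "", errors.New(`invalid context mapping mode: unsupported mapping mode "invalid_mode"`)
}

func (e *exporterStub) pushLogsData(ctx context.Context) error {
	mode, err := e.getRequestMappingMode(ctx)
	if err != nil {
		// A bad mapping mode in the client metadata cannot succeed on a
		// retry of the same request, so reject it permanently.
		return consumererror.NewPermanent(err)
	}
	_ = mode
	// ... the error from getScopeMappingMode would get the same treatment ...
	return nil
}

func main() {
	err := (&exporterStub{}).pushLogsData(context.Background())
	fmt.Println(consumererror.IsPermanent(err), "-", err)
}
```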
I did some testing running a local collector on the first two cases (a sketch of the retry contract involved follows the list):
- Error: `invalid context mapping mode`
  - I observed two log entries for the same request (so one retry attempt)
  - When wrapping the error with `NewPermanent`, I only saw one log entry (no retry)
- Error: `invalid scope mapping mode`
  - I observed one log entry for the same request (no retry)
  - When wrapping the error with `NewPermanent`, there was also no retry.
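As a side note, the contract these observations rely on can be sketched as a retry loop that consults `consumererror.IsPermanent`; this is only an illustration of the principle, not the actual retry code in exporterhelper or the client:

```go
package main

import (
	"errors"
	"fmt"

	"go.opentelemetry.io/collector/consumer/consumererror"
)

// exportWithRetry illustrates the contract: permanent errors are returned
// immediately, anything else is retried up to maxAttempts times.
func exportWithRetry(send func() error, maxAttempts int) (attempts int, err error) {
	for attempts = 1; attempts <= maxAttempts; attempts++ {
		err = send()
		if err == nil || consumererror.IsPermanent(err) {
			return attempts, err
		}
	}
	return maxAttempts, err
}

func main() {
	plain := errors.New("invalid scope mapping mode")

	a, _ := exportWithRetry(func() error { return plain }, 3)
	fmt.Println("plain error, attempts:", a) // 3

	a, _ = exportWithRetry(func() error { return consumererror.NewPermanent(plain) }, 3)
	fmt.Println("permanent error, attempts:", a) // 1
}
```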
Let me know what you think.
> one retry attempt
Q: When there are retries, are they retries from within the collector itself in earlier stages of the pipeline, i.e. receiver / processor, or from the client which sends the problematic request?
If it is the latter, how did the response change at the boundary? e.g. otlp / otlphttp response code
~~Hmm, not exactly sure where the retries were happening. I will need to get back to you on this one.~~
> Q: When there are retries, are they retries from within the collector itself in earlier stages of the pipeline, i.e. receiver / processor, or from the client which sends the problematic request?
The client (telemetrygen) retries by default. I confirmed by checking the collector internal metrics when we do not return a permanent error:
```
rpc_server_request_size_count{rpc_method="Export",rpc_service="opentelemetry.proto.collector.logs.v1.LogsService",rpc_system="grpc",service_instance_id="87f63512-cbb4-4f4c-b87a-fdb8cbb900ce",service_name="otelcontribcol",service_version="0.128.0-dev"} 2
```
> If it is the latter, how did the response change at the boundary? e.g. otlp / otlphttp response code
I do see the otlp response changing though when using the telemetrygen client: `telemetrygen logs --otlp-insecure 1 --otlp-header x-elastic-mapping-mode=\"invalid_mode\"`
Before returning the permanent error:

```
2025-06-06T16:04:13.086-0700 FATAL logs/worker.go:81 exporter failed {"worker": 0, "error": "context deadline exceeded: rpc error: code = Unavailable desc = invalid context mapping mode: unsupported mapping mode \"invalid_mode\", expected one of [\"bodymap\" \"ecs\" \"none\" \"otel\" \"raw\"]"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen/pkg/logs.worker.simulateLogs
```
After returning the permanent error:

```
2025-06-06T16:07:10.446-0700 FATAL logs/worker.go:81 exporter failed {"worker": 0, "error": "rpc error: code = Internal desc = Permanent error: invalid context mapping mode: unsupported mapping mode \"invalid_mode\", expected one of [\"bodymap\" \"ecs\" \"none\" \"otel\" \"raw\"]"}
github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen/pkg/logs.worker.simulateLogs
```
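The change in status code is what stops the client-side retry: the non-permanent error surfaced as Unavailable, which OTLP clients generally treat as retryable, while the permanent error surfaced as Internal, which they do not. A hedged sketch of that client-side decision (`retryWorthy` is a made-up helper, and real retry policies cover more codes than shown):

```go
package main

import (
	"fmt"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// retryWorthy is a simplified stand-in for a client-side retry policy:
// real OTLP clients retry on a handful of codes (Unavailable among them),
// while Internal is treated as a terminal failure.
func retryWorthy(err error) bool {
	return status.Code(err) == codes.Unavailable
}

func main() {
	// Mirrors the two responses above.
	before := status.Error(codes.Unavailable, "invalid context mapping mode")
	after := status.Error(codes.Internal, "Permanent error: invalid context mapping mode")

	fmt.Println(retryWorthy(before)) // true  -> explains the duplicate request
	fmt.Println(retryWorthy(after))  // false -> no retry once the error is permanent
}
```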
Thanks for the example. LGTM.