Limited OTLP exporter request authentication methods
What are you trying to achieve?
We want the OTLP exporter to be able to authenticate dynamically to OTLP endpoints that require more than static credentials (for example, API keys set in the environment).
What did you expect to see?
We expected a mechanism that lets OTel users interact with the HTTP client used in the OTLP exporter and/or modify the HTTP headers as requests are sent to an OTLP endpoint. Neither mechanism exists. This capability is missing in all supported OTel instrumentation languages we've looked into (Java, Python, JS, .NET).
Additional context
This discussion was originally triggered by requests to have AWS SigV4 authentication in the OTel Java instrumentation (open-telemetry/opentelemetry-java/issues/7002). In that discussion, we found the capabilities missing in Java are also missing in the other languages, that the issue is not limited to AWS SigV4 authentication, and that dynamic authorization mechanisms for data sent to OTLP endpoints were not supported by OTel instrumentations.
As such, we brought this issue up in the OpenTelemetry Sig: Specifications meeting on 28/01/2025 (Google Doc, meeting recording). In this meeting, we discussed the issue, found that there is some community interest in addressing it, went over some potential solutions, and gathered some potential first steps (namely, creating this ticket).
In that meeting, the following related issue, previously raised in the Go instrumentation, was brought up: open-telemetry/opentelemetry-go/issues/2632. It describes a similar missing capability: Go users are unable to specify the HTTP client used to export data to the OTLP endpoint. We can see in that issue that there is interest from the community and other stakeholders like Azure (comment being referenced) in accommodating more complex HTTP request manipulation, especially for the purposes of secure authentication.
One discussed potential solution that is of note for future reference is to introduce callback hooks in the exporters to allow for response-driven actions like re-authentication to the OTLP endpoint, e.g. make request → 401 response → callback hook triggers reauthentication.
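To make the make request → 401 response → callback flow concrete, here is a minimal Python sketch. Everything in it is hypothetical: `export_with_reauth`, the `send` transport callable, and the `on_unauthorized` hook are illustration-only names and not part of any OTel SDK. It only shows one way a response-driven hook could be wired, with a bounded number of re-authentication attempts.

```python
from typing import Callable, Dict, Optional

# Hypothetical transport: takes headers and body, returns an HTTP status code.
Transport = Callable[[Dict[str, str], bytes], int]
# Hypothetical hook: called on a 401, returns fresh auth headers (or None to give up).
OnUnauthorized = Callable[[], Optional[Dict[str, str]]]


def export_with_reauth(
    send: Transport,
    body: bytes,
    headers: Dict[str, str],
    on_unauthorized: Optional[OnUnauthorized] = None,
    max_reauth_attempts: int = 1,
) -> int:
    """Send once; on a 401, ask the hook for new headers and retry a bounded number of times."""
    status = send(headers, body)
    attempts = 0
    while status == 401 and on_unauthorized is not None and attempts < max_reauth_attempts:
        fresh = on_unauthorized()
        if fresh is None:
            break  # the hook could not obtain new credentials
        headers = {**headers, **fresh}  # fresh auth headers override stale ones
        status = send(headers, body)
        attempts += 1
    return status


def make_fake_endpoint(valid_token: str) -> Transport:
    """Stand-in for an OTLP endpoint that accepts exactly one bearer token."""
    def send(headers: Dict[str, str], body: bytes) -> int:
        return 200 if headers.get("Authorization") == f"Bearer {valid_token}" else 401
    return send
```

With the fake endpoint, a stale token yields 401 when no hook is configured, and a hook that returns a valid token lets the retry succeed.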
Discussion on the same subject in the Go repository:
- https://github.com/open-telemetry/opentelemetry-go/issues/4536
- https://github.com/open-telemetry/opentelemetry-go/issues/5129
@trask, we don't have official support for AWS SigV4 authentication in Quarkus. Users would have to connect to AWS following the examples in the docs: https://github.com/aws-samples/sigv4a-signing-examples/tree/main/java
Hi @brunobat, this is helpful.
However, the issue here is that the clients making the call to an OTLP endpoint are abstracted away within the exporters. So it's not possible for OTel users to modify or enhance the behavior of these clients to add a complex auth process like AWS SigV4, where the entire request, including the body, is needed to construct the signature.
Hi folks, to address this feature request I'm proposing the following approach. I would love to get feedback to improve the proposal and nail down something concrete that can be implemented in the SDKs. Tagging some folks that I think may be interested in this, but anyone is welcome to chime in. :) @trask @jack-berg @lmolkova @majanjua-amzn @mxiamxia
NOTE:
- The approach below is idiomatic to OpenTelemetry Python but can be generalized to fit other languages' semantics.
- The proposal does not intend to provide specifications about the solution. The intent is to explain the proposal with some examples but the exact specification is yet to be discussed and formalized.
Objective
The objective is to add a callback hook, while maintaining compatibility with the existing SpanExporter implementation, that lets users provide their own authentication mechanism directly to the SDK when sending telemetry to a service endpoint that requires some form of authentication, such as OAuth or AWS SigV4.
Implementation Approach
- Introduce an authenticator hook in the SpanExporter:
  - This should be a callable that receives the HTTP request data (headers, body, URL, etc.) and returns a dict of key-value pair(s) of authentication headers back to the calling exporter.
    - The idea is similar to OpenTelemetry Java's authentication mechanism via a Supplier implementation, but that mechanism does not work for complex auth processes like AWS SigV4, where the entire request is needed to compute the authentication header.
    - The request data passed to the authenticator must be immutable to prevent any malicious modification of the request.
  - Users can provide this hook during the exporter builder phase to inject custom logic. The logic will be called each time the exporter exports a batch of spans, which allows auth tokens/credentials to be refreshed.
- Modify the export method:
  - Before sending the request, call the authenticator hook (if provided) and receive a dict of key-value pairs of headers.
  - The headers will be added to the prepared request and the request will be sent out.
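The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RequestData`, `SketchExporter`, `build_headers`, and `body_hash_authenticator` are hypothetical names invented here, the exporter is a stand-in that sends nothing, and the example authenticator adds a body-derived header rather than performing real OAuth or SigV4 signing.

```python
import hashlib
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Dict, Mapping, Optional


@dataclass(frozen=True)
class RequestData:
    """Read-only view of the outgoing export request (hypothetical type)."""
    method: str
    url: str
    headers: Mapping[str, str]
    body: bytes


# The hook: request data in, authentication headers out.
Authenticator = Callable[[RequestData], Dict[str, str]]


class SketchExporter:
    """Stand-in for an OTLP/HTTP exporter; only the hook wiring is shown."""

    def __init__(self, endpoint: str, authenticator: Optional[Authenticator] = None):
        self._endpoint = endpoint
        self._authenticator = authenticator
        self._base_headers = {"Content-Type": "application/x-protobuf"}

    def build_headers(self, body: bytes) -> Dict[str, str]:
        """Return the headers for one batch, calling the hook if one is configured."""
        headers = dict(self._base_headers)
        if self._authenticator is not None:
            # The hook sees a frozen dataclass wrapping a read-only copy of the headers,
            # so it cannot alter the request it is asked to authenticate.
            view = RequestData("POST", self._endpoint, MappingProxyType(dict(headers)), body)
            headers.update(self._authenticator(view))
        return headers


def body_hash_authenticator(request: RequestData) -> Dict[str, str]:
    """Example hook whose output depends on the full body (not a real signing scheme)."""
    return {"X-Body-SHA256": hashlib.sha256(request.body).hexdigest()}
```

Because `build_headers` runs once per batch, a hook written this way gets a chance to refresh tokens or recompute a body-dependent signature on every export.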
Sample UX for an OpenTelemetry User
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from otel_aws_sigv4 import sigv4_authenticator  # suppose this is the auth module for AWS SigV4

# Create an OpenTelemetry OTLP exporter with AWS SigV4 signing enabled
# (authenticator is the proposed parameter, not part of the current exporter API)
exporter = OTLPSpanExporter(
    endpoint="https://xray.us-east-1.amazonaws.com/v1/traces",
    authenticator=sigv4_authenticator,
)

# Set up OpenTelemetry with the SigV4-signed exporter
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)  # register the provider so the tracer below uses it

# Obtain a tracer and create spans
tracer = trace.get_tracer("otel-example")
with tracer.start_as_current_span("example-span"):
    pass  # Simulating a traced operation
Challenges
A few technical kinks I could think of with this approach that we will need to address:
How do we provide a way for the auto-instrumentation users to configure the authenticator?
- We will probably need a new environment variable, OTEL_OTLP_EXPORTER_AUTHENTICATOR, for this configuration. The value needs to be one of the registered authenticators, such as oauth2, awssigv4, etc., for the SDK to automatically plug in the specified authenticator.
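A name-based lookup like this could be sketched as a small registry. The environment variable name comes from the proposal above; everything else (`register_authenticator`, `authenticator_from_env`, the simplified body-only hook signature) is a hypothetical illustration and not an existing SDK API.

```python
import os
from typing import Callable, Dict, Mapping, Optional

# Simplified hook signature for this sketch: body in, auth headers out.
Authenticator = Callable[[bytes], Dict[str, str]]

ENV_VAR = "OTEL_OTLP_EXPORTER_AUTHENTICATOR"  # proposed name, not an existing variable

_REGISTRY: Dict[str, Authenticator] = {}


def register_authenticator(name: str, authenticator: Authenticator) -> None:
    """Make an authenticator selectable by name (names are case-insensitive)."""
    _REGISTRY[name.lower()] = authenticator


def authenticator_from_env(environ: Mapping[str, str] = os.environ) -> Optional[Authenticator]:
    """Resolve the configured authenticator, or None when the variable is unset or empty."""
    name = environ.get(ENV_VAR, "").strip().lower()
    if not name:
        return None
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"unknown authenticator {name!r}; registered: {sorted(_REGISTRY)}"
        ) from None
```

Failing loudly on an unknown name is a design choice in this sketch; an SDK might prefer to log a warning and export without authentication instead.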
How can we make the interface signature of the authenticator agnostic to the exporter's http client request type?
- Across languages, the SpanExporter implementations use different HTTP clients to create the export request. For example, Python uses requests while Java uses OkHttp or the native HTTP client, depending on the user's configuration. For the authenticator to be called with the request as a parameter, a generic request type needs to be defined.
- Let's say the new type is called OTLPExportRequest. The SpanExporters across the SDKs must first create this instance with the HTTP method, URL, body, etc. and pass it to the authenticator. Then the exporter must use the parts from this OTLPExportRequest along with the headers from the authenticator to construct the final export request.
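As a rough illustration of the client-agnostic type, the sketch below defines an immutable `OTLPExportRequest` and one adapter that turns it into a concrete client request. The field layout and the `to_urllib_request` helper are assumptions made for this example; the standard library's `urllib.request` is used only to keep the sketch dependency-free, whereas a real exporter would adapt to its own client (requests, OkHttp, and so on).

```python
import urllib.request
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, Mapping


@dataclass(frozen=True)
class OTLPExportRequest:
    """Client-agnostic, immutable description of one export request (hypothetical)."""
    method: str
    url: str
    body: bytes
    headers: Mapping[str, str] = field(default_factory=lambda: MappingProxyType({}))


def to_urllib_request(
    export_request: OTLPExportRequest, auth_headers: Dict[str, str]
) -> urllib.request.Request:
    """Combine the generic request with the authenticator's headers for one client."""
    merged = {**export_request.headers, **auth_headers}  # auth headers win on conflict
    return urllib.request.Request(
        url=export_request.url,
        data=export_request.body,
        headers=merged,
        method=export_request.method,
    )
```

The authenticator only ever sees `OTLPExportRequest`, so the same authenticator could be reused with any exporter that provides its own adapter.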
A couple of notes/questions -
- Is this something that has to be at the SDK level, or could it be implemented solely in the Collector?
- Wouldn't this need to be for all signals? Wouldn't it make more sense to have this be a part of the OTLP exporter itself, rather than as an extension to the SpanExporter interface?
Thanks.
> This should be a callable that receives the HTTP request data (headers, body, URL, etc.) and returns a dict of key-value pair(s) of authentication header back to the calling exporter.
In opentelemetry-java, we incrementally write the request body to the output stream. There's no place in the code where we have a full copy of the request bytes along with the other request data you discuss (headers, URL, etc).
> Is this something that has to be at the SDK level, or could it be implemented solely in the Collector?
It is already implemented in the Collector: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/sigv4authextension
For Python, one possibility is to subclass the requests.Session object to handle auth, as is done here: https://google-auth.readthedocs.io/en/master/_modules/google/auth/transport/requests.html#AuthorizedSession. The session object is already a parameter to the OTLP HTTP Python exporters. I'm trying to add an environment variable to let the session object be passed in for auto-instrumentation in Python.
Before looking into possible ways of implementing this in different languages (which sounds like it wouldn't be a trivial change in some languages, e.g. Java), I think it'd be worth clarifying whether this is something we think needs to be in the SDK spec, or whether it can be handled at the Collector layer, where it is already supported.
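A minimal sketch of the session-based idea follows. Note the swap: instead of subclassing `requests.Session` as the linked `AuthorizedSession` does, it attaches a `requests.auth.AuthBase` handler to a plain session, which is the hook requests itself provides for per-request auth. `BodyHashAuth` and `make_authenticated_session` are hypothetical names, and the header is a placeholder for a real signature.

```python
import hashlib

import requests
from requests.auth import AuthBase


class BodyHashAuth(AuthBase):
    """Illustrative auth handler: adds a header derived from the full request body."""

    def __call__(self, prepared: requests.PreparedRequest) -> requests.PreparedRequest:
        body = prepared.body or b""
        if isinstance(body, str):
            body = body.encode("utf-8")
        # Only bytes/str bodies are handled here; streamed bodies would need buffering.
        prepared.headers["X-Body-SHA256"] = hashlib.sha256(body).hexdigest()
        return prepared


def make_authenticated_session() -> requests.Session:
    """Build a session whose requests all pass through BodyHashAuth."""
    session = requests.Session()
    session.auth = BodyHashAuth()
    return session
```

Assuming the exporter's session parameter mentioned above, such a session could then be handed to the exporter, for example `OTLPSpanExporter(endpoint=..., session=make_authenticated_session())`. The handler runs after requests has prepared the body, which is why a body-dependent scheme is feasible on this path.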
Hello folks. Sorry, I was away for the last couple weeks and couldn't follow up here.
@danielgblanco I believe that adding custom/pre-defined authenticators can be handled within the Collector. As @jack-berg mentioned, the AWS SigV4 extension is already present there. This discussion is mainly focused on having authenticators on the SDK side, and I think this does need a spec to be consistently implemented across languages. Let me know what you think.
> > This should be a callable that receives the HTTP request data (headers, body, URL, etc.) and returns a dict of key-value pair(s) of authentication header back to the calling exporter.
> In opentelemetry-java, we incrementally write the request body to the output stream. There's no place in the code where we have a full copy of the request bytes along with the other request data you discuss (headers, URL, etc).
Hmm I see. To clarify, do you mean to say that at no point will the exporter have the complete request? How does this work then? Where does the request get fully formed?
> Wouldn't this need to be for all signals? Wouldn't it make more sense to have this be a part of the OTLP exporter itself, rather than as an extension to the SpanExporter interface?
Yes. Ideally we should have this for all the signal exporters. I just took the example of SpanExporter for simplicity. The OTLP exporter is probably a higher-level abstraction over the signal-specific exporters where the actual request gets created, so for the purposes of SigV4, having the auth mechanism at the OTLP exporter level may not work.
> Is this something that has to be at the SDK level, or could it be implemented solely in the Collector?
@austinlparker I believe this is needed in the SDK only if running a collector is not an option. From what I gather from the relevant issues in Go and Python, the requirement is to have a dynamic authentication mechanism within the export pipeline, which can be, and likely would be more effectively, addressed in the Collector.
At AWS, we do have use-cases where running a collector is not an option either due to platform or resource limitations and the SDK needs to communicate to backend's OTLP endpoint directly with SigV4 authentication. I would be interested in solving this problem on the SDK side for that reason. The hope is to do so in a way that can be generalized and extended to other authentication methods.
We'll discuss this at the next spec meeting!
> Hmm I see. To clarify, do you mean to say that at no point will the exporter have the complete request? How does this work then? Where does the request get fully formed?
It doesn't! The request doesn't need to be fully formed in memory. The request is constructed with a reference to the marshaler, which incrementally serializes and writes the bytes to the request output stream. See source code here: https://github.com/open-telemetry/opentelemetry-java/blob/main/exporters/sender/okhttp/src/main/java/io/opentelemetry/exporter/sender/okhttp/internal/OkHttpHttpSender.java#L127-L133
For sigv4 to work, I think users would have to accept a performance hit of serializing the full request body byte array in memory.
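The trade-off can be illustrated with a small Python stand-in (the Java exporter itself is not shown, and `marshal_chunks`, `stream_export`, and `buffered_export_with_hash` are hypothetical names). SigV4 includes a SHA-256 hash of the payload in what gets signed, and the signature travels in a request header, so the complete body has to exist before the first byte goes out. The `X-Body-SHA256` header below is a placeholder for the real signature.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Iterator


def marshal_chunks() -> Iterator[bytes]:
    """Stand-in for a marshaler that yields the serialized payload piece by piece."""
    yield b"resource-spans:"
    yield b"span-1;"
    yield b"span-2;"


def stream_export(out: BinaryIO) -> None:
    """Streaming path: chunks go straight to the output stream; no full copy exists."""
    for chunk in marshal_chunks():
        out.write(chunk)


def buffered_export_with_hash(out: BinaryIO) -> Dict[str, str]:
    """Buffered path: serialize everything first so a body hash can go in the headers."""
    buffer = io.BytesIO()
    for chunk in marshal_chunks():
        buffer.write(chunk)
    body = buffer.getvalue()  # the full in-memory copy is the cost being discussed
    headers = {"X-Body-SHA256": hashlib.sha256(body).hexdigest()}
    out.write(body)
    return headers
```

Both paths put identical bytes on the wire; the buffered path additionally holds the whole payload in memory so that headers depending on it can be computed first.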
Summarizing the discussion from the spec meeting -- there was general agreement that this wasn't a community priority, but we would be happy to accept these changes if someone drove them to completion.
> Summarizing the discussion from the spec meeting -- there was general agreement that this wasn't a community priority, but we would be happy to accept these changes if someone drove them to completion.
@austinlparker So does the community still think a spec is needed and want someone to drive the spec? Or do you mean a spec is not the priority and the mechanism should be implemented individually in each language as needed?
BTW, for Python I have a proposal for using an environment variable to configure auth for the OTLP exporters for auto-instrumentation that is pretty similar to how the Collector does it. If people are interested or have concerns, please take a look; otherwise I think we should move forward with that for Python.
> > Summarizing the discussion from the spec meeting -- there was general agreement that this wasn't a community priority, but we would be happy to accept these changes if someone drove them to completion.
> @austinlparker So does the community still think a spec is needed and want someone to drive the spec? Or do you mean a spec is not the priority and the mechanism should be implemented individually in each language as needed?
Sorry about the delay - if we wanted this across the entire ecosystem, we'd need a spec for it. We would need someone to drive implementation and spec work.
@austinlparker thanks for clarifying. Having this across all languages makes sense. I will work on getting an OTEP ready.
Hey folks! I built a working SigV4 OTLP exporter that addresses this exact issue: https://github.com/Stupidoodle/opentelemetry-exporter-otlp-proto-http-sigv4 https://github.com/open-telemetry/opentelemetry-python-contrib/issues/3494
Key features:
- Full SigV4 request signing using botocore
- Handles the "performance hit" @jack-berg mentioned by properly serializing request bodies
- Comprehensive retry logic and error handling
- Production-tested with LocalStack integration
Instead of waiting for spec discussions, I solved the immediate Lambda container problem. Happy to contribute this to the official effort or help inform the OTEP with real implementation learnings. The code handles edge cases we haven't discussed yet - like payload size limits, compression variants, and proper AWS credential chain integration.
> Handles the "performance hit" @jack-berg mentioned by properly serializing request bodies
The performance hit is specific to the Java implementation of the OTLP exporters. To optimize performance, we iteratively serialize and write request bytes to the output stream, rather than serializing and holding an entire request body's bytes in memory. To my knowledge, there isn't a way to iteratively compute the SigV4 hash, and thus there is no way to avoid the performance hit in the Java implementation.