opentelemetry-cpp-contrib icon indicating copy to clipboard operation
opentelemetry-cpp-contrib copied to clipboard

NGINX: Allow overriding `opentelemetry_trace_id` from within the config

Open NaurisSadovskis opened this issue 3 years ago • 14 comments

Is your feature request related to a problem? We use Google Load Balancers which add a vendor specific header (X-Cloud-Trace-Context) to our requests. It seems compatible with W3C format, but OTEL library does not detect it as it's not supported and creates a new span which gets passed down the line (together with Google header), resulting in two separate traces which are not connected.

To solve it, we tried is parsing the X-Cloud-Trace-Context header within NGINX and extracting trace ID and using that to construct a new header which is passed down the line:

http {
      # omitted: some parsing code to $trace_id from X-Cloud-Trace-Context header
      server {
          opentelemetry_propagate;
          opentelemetry_trust_incoming_spans on; 
      location / { 
          proxy_pass http://some_endpoint;

          # Construct a custom header using Google's trace ID and OTEL's generated Span ID
          set $custom_traceparent "00-${trace_id}-${opentelemetry_span_id}-01";

          # Set header
          proxy_set_header traceparent $custom_traceparent;  
        }
    } 
}

This solution passes a valid trace down the line, however it also results in two traces due to OTEL library generating it's own opentelemetry_trace_id which it then sends to the OLTP exporter host. It's not entirely clear to me at what point in the NGINX request flow OTEL generates its trace ID and I'm wondering if there is a way to override it. This can be visualised as follows:

graph TD
    A(Request) -->B(Google Load Balancer)
    B --> |Google adds: X-Cloud-Trace-Context|C(NGiNX)
    C -->|Custom traceparent - Trace A| D(Application)
    C -->|Default/Internal traceparent - Trace B| E(OLTP exporter host)

Describe the solution you'd like

  • Approach 1: Ideal solution would be to find out if there's a way to generate and set headers within NGINX before the OtelGetTraceId function gets called (code).
  • Approach 2: Alternatively, since NGINX does not seem to support overriding already set variables, perhaps the library functionality could be extended to allow overriding this value from within the code.

Thanks and I hope you can provide some additional context how to best solve it!

NaurisSadovskis avatar Oct 06 '22 08:10 NaurisSadovskis

Have you tried the otel-webserver-module for nginx as an alternative?

https://github.com/open-telemetry/opentelemetry-cpp-contrib/tree/main/instrumentation/otel-webserver-module

svrnm avatar Oct 06 '22 09:10 svrnm

I have not, but it seems it also does not support overriding incoming headers to construct new ones ?

NaurisSadovskis avatar Oct 06 '22 10:10 NaurisSadovskis

@DebajitDas , @kpratyus : can you comment on that?

svrnm avatar Oct 07 '22 06:10 svrnm

If the key name of Tracecontext coming from Google Load Balancer is same as the opentelemetry specification than the tracecontext will be overridden. Is the key name "X-Cloud-Trace-Context" because what otel is using is different?

kpratyus avatar Oct 07 '22 08:10 kpratyus

@kpratyus exactly - Google Load Balancers do not use traceparent header and in our chain the conversion needs to happen in NGINX so we also can instrument the entire flow. Currently our NGINX spans are missing as we're propagating Google header down the line since Python/etc has support for it.

After the initial post, I've tried using openresty/headers-more-nginx-module to see if I can override input headers and pre-process trace before it hits OTEL module, but due to possibly how modules are invoked, I couldn't to get it to work.

Side-note: Google might support traceparent eventually since some of their serverless products already do, but there are no public dates for it.

NaurisSadovskis avatar Oct 07 '22 11:10 NaurisSadovskis

@NaurisSadovskis The otel specification says that "traceparent" key should be used for context propogation. So unless there is a change in otel specification the agents are forced to use "traceparent" as key.

kpratyus avatar Oct 08 '22 15:10 kpratyus

I am not 100% sure how C++ does it in opentelemetry, but I think what is needed first, is general support for a cloud trace propagators like other languages have it, then ideally there should be a composite propagator that allows combination and then nginx module could support them

svrnm avatar Oct 10 '22 06:10 svrnm

For reference, right now there is b3, trace context & jaeger for C++:

https://github.com/open-telemetry/opentelemetry-cpp/tree/703576c0b8e59d321d544cabde1cff89b92c8436/api/include/opentelemetry/trace/propagation

svrnm avatar Oct 10 '22 06:10 svrnm

I should note that Google's format is compatible with W3C (if this was not immediately clear). Namely:

# Google: X-Cloud-Trace-Context
05d7c045f9cd3c474f9e6b733823fb44/14127515246394157611;o=1
  ^                                         ^                    
  |                                         | 
trace id (32 chars)                      span id
# W3C: traceparent
00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01
    ^                                         ^                    
    |                                         | 
  trace id (32 chars)                      span id

Thus, if we're able to split the incoming Google header and override the opentelemetry_span_id variable set by NGINX module, we would not need to make custom propagators for a standard that might or might not be there in few years.

NaurisSadovskis avatar Oct 11 '22 08:10 NaurisSadovskis

I should note that Google's format is compatible with W3C (if this was not immediately clear). Namely:

# Google: X-Cloud-Trace-Context
05d7c045f9cd3c474f9e6b733823fb44/14127515246394157611;o=1
  ^                                         ^                    
  |                                         | 
trace id (32 chars)                      span id
# W3C: traceparent
00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01
   ^                                         ^                    
   |                                         | 
 trace id (32 chars)                      span id

Thus, if we're able to split the incoming Google header and override the opentelemetry_span_id variable set by NGINX module, we would not need to make custom propagators for a standard that might or might not be there in few years.

But OTEL agents can only handle "traceparent" as the key not "X-Cloud-Trace-Context"

kpratyus avatar Nov 16 '22 07:11 kpratyus

Hi @kpratyus, just to update this thread - we have found a temporary workaround by using and modifying headers-more module (see below) to run at an earlier stage and build up a valid traceparent from the incoming X-Cloud-Trace-Context header. This causes this OTEL library to correctly pick up the header and propagate it.

We have also opened a feature request on Google's issue tracker: #253419736 Set traceparent in addition to x-cloud-trace-context.

diff --git a/src/ngx_http_headers_more_filter_module.c b/src/ngx_http_headers_more_filter_module.c
index 0bb6fec..a230f1b 100644
--- a/src/ngx_http_headers_more_filter_module.c
+++ b/src/ngx_http_headers_more_filter_module.c
@@ -244,7 +244,7 @@ ngx_http_headers_more_post_config(ngx_conf_t *cf)
 
     cmcf = ngx_http_conf_get_module_main_conf(cf, ngx_http_core_module);
 
-    h = ngx_array_push(&cmcf->phases[NGX_HTTP_REWRITE_PHASE].handlers);
+    h = ngx_array_push(&cmcf->phases[NGX_HTTP_SERVER_REWRITE_PHASE].handlers);
     if (h == NULL) {
         return NGX_ERROR;
     }

NaurisSadovskis avatar Nov 16 '22 11:11 NaurisSadovskis

Hi @NaurisSadovskis, We are facing the same issue and not able to propagate X-Cloud-Trace-Context received from gcp lb to nginx as traceparent header. Could you please share more details on how did you work around it?

GitKaran avatar Aug 02 '24 12:08 GitKaran

Just finding this issue today.

When using custom tags in trace propagation, the proper way is to implement a custom propagator, which is aware of the tags used.

See the following C++ APIs:

  • class opentelemetry::context::propagation::TextMapCarrier
  • class opentelemetry::context::propagation::TextMapPropagator

See examples in sub classes, like:

  • class opentelemetry::trace::propagation::B3PropagatorExtractor

See the global propagator singleton, which is designed to handle exactly this case:

  • class opentelemetry::context::propagation::GlobalTextMapPropagator

marcalff avatar Aug 02 '24 12:08 marcalff