
Add functionality to set trace ID equal to x-request-id in Envoy

Open kemko opened this issue 1 year ago • 8 comments

Hello guys!

We use nginx and Envoy. For a long time, we have used nginx to make sure that x-request-id matches the trace ID. Now we want to phase out nginx, but we can't replicate the same logic in Envoy.

Please add the functionality to set the trace ID equal to x-request-id (if the header was absent in the original request).

kemko avatar Aug 20 '24 13:08 kemko

cc @wbpcode

adisuissa avatar Aug 20 '24 15:08 adisuissa

@kemko, could you elaborate on how your Envoy is configured and provide more context for what you want?

Are you using a custom tracer extension? If so, that can be achieved by modifying the header context. Also, for historical reasons, the x-request-id is changed by 1 bit if tracing is configured; that can be controlled with UuidRequestIdConfig.
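For reference, a minimal sketch of where UuidRequestIdConfig sits in the HTTP connection manager configuration. It assumes the `request_id_extension` field and the `pack_trace_reason` option as described in the Envoy API reference; the surrounding values are placeholders, and this only stops Envoy from altering the generated UUID, it does not make the trace ID follow x-request-id.

```yaml
# Fragment of an HttpConnectionManager typed_config (not a complete envoy.yaml).
stat_prefix: ingress_http
generate_request_id: true
request_id_extension:
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig
    # Assumption: false keeps the generated UUID unmodified instead of
    # packing the trace sampling decision into it.
    pack_trace_reason: false
```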

botengyao avatar Aug 21 '24 00:08 botengyao

Currently we have Nginx, which we use for HTTPS termination and some other things, including generating x-request-id and traceparent headers if they are not passed with the request.

So right now there are no special settings in Envoy, just generate_request_id: true and the tracing section configured.

While checking how Envoy would work without Nginx in this place, I realized that x-request-id and trace id are completely different when both are generated by Envoy.

Since sending traces involves not only sending a header with the trace id, but also sending a span that should also contain the trace id, I abandoned the idea of modifying the traceparent header.

Then I tried to read the traceparent header in order to set x-request-id from it myself, but I couldn't get that to work even with Lua:

envoy.yaml

---
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                generate_request_id: false
                tracing:
                  provider:
                    name: envoy.tracers.opentelemetry
                    typed_config:
                      "@type": type.googleapis.com/envoy.config.trace.v3.OpenTelemetryConfig
                      grpc_service:
                        envoy_grpc:
                          cluster_name: jaeger
                        timeout: 0.250s
                      service_name: envoy
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                http_filters:
                  - name: envoy.filters.http.header_to_metadata
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.header_to_metadata.v3.Config
                      request_rules:
                        - header: Traceparent
                          on_header_present:
                            metadata_namespace: envoy.lb
                            key: traceparent
                            type: STRING
                          remove: false
                  - name: envoy.filters.http.lua
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
                      inline_code: |
                        function envoy_on_request(request_handle)
                          request_handle:logCritical(request_handle:headers():get("traceparent") or "headers():get is empty")
                          request_handle:logCritical((request_handle:streamInfo():dynamicMetadata():get("envoy.lb") and request_handle:streamInfo():dynamicMetadata():get("envoy.lb")["traceparent"]) or "dynamicMetadata():get is empty")
                        end
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
    - name: echo_http
      connect_timeout: 0.25s
      type: STRICT_DNS
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: echo_http
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: echo-http
                      port_value: 8080

    - name: jaeger
      connect_timeout: 0.25s
      type: strict_dns
      lb_policy: round_robin
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: jaeger
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: jaeger
                      port_value: 4317

admin:
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8081

With this configuration I got "headers():get is empty" and "dynamicMetadata():get is empty" in the logs.

But of course it would be better to do without Lua at all.

P.S. Changing a bit won't be a problem if it's consistent in both headers.

kemko avatar Aug 22 '24 10:08 kemko

@botengyao Hello! Is this information sufficient, or do I need to provide anything else?

kemko avatar Sep 05 '24 10:09 kemko

@kemko, sorry for the delay.

I think it makes sense to add a knob in the OpenTelemetry tracer to set the trace ID from x-request-id. Right now it just uses the IDs from traceparent. cc @alexanderellis, @yanavlasov

botengyao avatar Sep 18 '24 15:09 botengyao

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 18 '24 16:10 github-actions[bot]

This issue is still relevant. @AlexanderEllis, @yanavlasov, could you take a look at it?

kemko avatar Oct 18 '24 16:10 kemko

Does setting the x-request-id as a tag help?
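A rough sketch of what that could look like, assuming the `custom_tags` field of the HTTP connection manager `tracing` section with a `request_header` source. The tag name `request_id` is a hypothetical choice, and the provider block is omitted. This only attaches the request ID to each span so traces can be searched by it; the trace ID itself stays different from x-request-id.

```yaml
# Fragment of an HttpConnectionManager typed_config (not a complete envoy.yaml).
tracing:
  # provider: ... (tracer configuration omitted)
  custom_tags:
    - tag: request_id            # hypothetical tag name
      request_header:
        name: x-request-id       # header whose value becomes the tag value
```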

wbpcode avatar Oct 20 '24 11:10 wbpcode