ingress-nginx icon indicating copy to clipboard operation
ingress-nginx copied to clipboard

opentelemetry_propagate do not work with auth endpoints

Open csepulveda opened this issue 1 year ago • 16 comments

What happened:

Enabling the open telemetry module, the traces are created by nginx and then propagated to the application behind. But when the backend is the authorization endpoint, the opentracing headers are not present. Here is an example of the auth-url configuration.

    nginx.ingress.kubernetes.io/auth-url: http://yyyy.xxx.svc.cluster.local/auth
    nginx.ingress.kubernetes.io/auth-cache-key: $http_authorization$arg_api_key$request_method
    nginx.ingress.kubernetes.io/auth-response-headers: partner_id

When the auth endpoint its called, there are no opentracing headers.

T 172.16.11.212:59146 -> 172.16.10.71:8080 [AP] #265
GET /auth HTTP/1.1.
X-Request-ID: 332a44b693c19f83fce1bdd929b2a165.
Host: yyyy.xxx.svc.cluster.local.
X-Original-URL: https://xxxx.yyyyy.zzzz/v2/admin/financings?sort=&page=1&per_page=100.
X-Original-Method: GET.
X-Sent-From: nginx-ingress-controller.
X-Real-IP: xxxxx.
X-Forwarded-For: xxxxx.
X-Auth-Request-Redirect: /v2/admin/financings?sort=&page=1&per_page=100.
Connection: close.
sec-ch-ua: "Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111".
accept: application/json.
sec-ch-ua-mobile: ?0.
authorization: eyJhbGcixxxx
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36.
sec-ch-ua-platform: "macOS".
origin: https://xxxx.yyyyy.zzzz
sec-fetch-site: cross-site.
sec-fetch-mode: cors.
sec-fetch-dest: empty.
referer: https://xxxx.yyyyy.zzzz/.
accept-encoding: gzip, deflate, br.
accept-language: en-US,en;q=0.9.

What you expected to happen: Maybe when the location its internal the header are not injected when the location its internal.

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

kubectl exec -it $POD_NAME -n $POD_NAMESPACE -- /nginx-ingress-controller --version
Defaulted container "controller" out of: controller, opentelemetry (init)
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v1.7.0
  Build:         72ff21ed9e26cb969052c753633049ba8a87ecf9
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.21.6

-------------------------------------------------------------------------------

Kubernetes version (use kubectl version):

Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.10-eks-48e63af", GitCommit:"9176fb99b52f8d5ff73d67fea27f3a638f679f8a", GitTreeState:"clean", BuildDate:"2023-01-24T19:17:48Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}

Environment: AWS EKS 1.24

  • How was the ingress-nginx-controller installed:
my-ingress-nginx                                 	kube-system  	13      	2023-03-30 14:53:21.741429 -0300 -03   	deployed	ingress-nginx-4.6.0                  	1.7.0
USER-SUPPLIED VALUES:
controller:
  config:
    allow-snippet-annotations: true
    enable-opentelemetry: "true"
    enable-underscores-in-headers: true
    opentelemetry-config: /etc/nginx/opentelemtry.toml
    opentelemetry-operation-name: HTTP $request_method $service_name $uri
    opentelemetry-trust-incoming-span: "false"
    otel-max-export-batch-size: "512"
    otel-max-queuesize: "2048"
    otel-sampler: AlwaysOn
    otel-sampler-parent-based: "false"
    otel-sampler-ratio: "1.0"
    otel-schedule-delay-millis: "5000"
    otel-service-name: nginx-proxy
    otlp-collector-host: xxxx.yyyy.svc
    otlp-collector-port: "4317"
    proxy-real-ip-cidr: 172.16.0.0/16
    use-proxy-protocol: true
  opentelemetry:
    enabled: true
  replicaCount: 2
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "60"
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
      service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: proxy_protocol_v2.enabled=true
      service.beta.kubernetes.io/aws-load-balancer-type: nlb

How to reproduce this issue: Install the latest version. Enable the opentelemetry Set any backend with /auth-url

csepulveda avatar Mar 30 '23 19:03 csepulveda

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 30 '23 19:03 k8s-ci-robot

/assign @esigo

longwuyuan avatar Mar 30 '23 19:03 longwuyuan

I'm not sure what you mean with opentracing headers. If you mean the context propagation, at the moment only w3s tracing context is supported. It's possible to support b3 context propagation here by changing it to opentelemetry_propagate b3. But it will not be possible to mix these contexts.

esigo avatar Mar 30 '23 20:03 esigo

Let me explain a little bit more.

When i make a request to the application behind nginx, if the request have to be authorized using auth-url, the opentracing headers (w3s) are propagated only to the app and not to the authorizer.

Request:

curl --location --request GET 'https://xx.yy.zz/v2/merchants' \
--header 'authorization: xxxxtoken'

headers on auth service

GET /auth HTTP/1.1.
X-Request-ID: 1988fbf01a6328935f2afb42ab0bd5d8.
Host: xxx-auth.yyy.svc.cluster.local.
X-Original-URL: https://xx.yy.zz/v2/merchants.
X-Original-Method: GET.
X-Sent-From: nginx-ingress-controller.
X-Real-IP: 11.99.99.99.
X-Forwarded-For: 11.99.99.99.
X-Auth-Request-Redirect: /v2/merchants.
Connection: close.
authorization: xxxxtoken.
User-Agent: PostmanRuntime/7.29.2.
Accept: */*.
Postman-Token: 845a33a4-6115-41bd-91da-435501c47b20.
Accept-Encoding: gzip, deflate, br.

headers on app

GET /merchants HTTP/1.1.
traceparent: 00-689d50c15cfac82080b3c70db777bc64-e9c02cfa0c878177-01.
partner_id: xxxx-xxx-xxx-xxx-xxxx.
Host: xx.yy.zz.
X-Request-ID: 1988fbf01a6328935f2afb42ab0bd5d8.
X-Real-IP: 11.99.99.99.
X-Forwarded-For: 11.99.99.99.
X-Forwarded-Host: xx.yy.zz.
X-Forwarded-Port: 443.
X-Forwarded-Proto: https.
X-Forwarded-Scheme: https.
X-Scheme: https.
authorization: xxxxtoken.
User-Agent: PostmanRuntime/7.29.2.
Accept: */*.
Postman-Token: 845a33a4-6115-41bd-91da-435501c47b20.
Accept-Encoding: gzip, deflate, br.

ingress configuration:

ingress:
    enabled: true
    className: "nginx"
    annotations:
      cert-manager.io/cluster-issuer: "letsencrypt"
      nginx.ingress.kubernetes.io/auth-url: http://xxx-auth.yyy.svc.cluster.local/auth
      nginx.ingress.kubernetes.io/auth-cache-key: $http_authorization$arg_api_key$request_method
      nginx.ingress.kubernetes.io/auth-response-headers: partner_id
      nginx.ingress.kubernetes.io/rewrite-target: /$2
      nginx.ingress.kubernetes.io/proxy-buffer-size: 256k
      nginx.ingress.kubernetes.io/proxy-body-size: 75m
      nginx.ingress.kubernetes.io/client-header-buffer-size: 128k
      nginx.ingress.kubernetes.io/opentracing-trust-incoming-span: "false"
      nginx.ingress.kubernetes.io/server-snippet: |
        location ~* "^/v2/debug/" {
            deny all;
            return 403;
          }
    hosts:
      - host: xx.zz.yy
        paths:
          - path: /v2(/|$)(.*)
            pathType: Prefix
  
    tls:
      - hosts:
        - xx.zz.yy
        secretName: xx.zz.yy

The idea to use the opentelemetry module on the ingress is to create the trace on nginx and propagate the traceparent header to both apps (the application and its authorizer) so in that way; I will be able to have all spans and timeline in the same traceId.

csepulveda avatar Mar 30 '23 21:03 csepulveda

Maybe its necessary to force the header using something like:

nginx.ingress.kubernetes.io/auth-url: http://foo.com/external-auth
nginx.ingress.kubernetes.io/auth-snippet: |
    proxy_set_header traceparent $traceparent;

Is there a way to get the traceparent as a variable?

csepulveda avatar Mar 30 '23 21:03 csepulveda

I tried the following configuration:

    nginx.ingress.kubernetes.io/auth-snippet: |
      proxy_set_header traceparent 123;

and i'm able to see the header on the authorized service:

...
X-Auth-Request-Redirect: /v2/offers.
traceparent: 123.
Connection: close.
....

The problem is when i try to use the variable "$opentelemetry_context_traceparent" instead 123 nothing shows up. It's like the variable opentelemetry_context_traceparent its created after the authorization 😢

csepulveda avatar Mar 30 '23 22:03 csepulveda

This is stale, but we won't close it automatically, just bare in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

github-actions[bot] avatar May 01 '23 01:05 github-actions[bot]

Here is an example to reproduce the issue: https://github.com/csepulveda/nginx-example

The expected result is to see the service-c span inside the trace generated by Nginx.

Regards!

csepulveda avatar May 04 '23 17:05 csepulveda

this works with the following config:

daemon off;
error_log /dev/stdout info;

load_module /modules_mount/etc/nginx/modules/otel/otel_ngx_module.so;

events {
  worker_connections 1024;
}

http {
  opentelemetry on;
  opentelemetry_operation_name "http";
  opentelemetry_config /conf/otel-nginx.toml;
  access_log /dev/stdout;
  opentelemetry_capture_headers on;
  add_header Server-Timing "traceparent;desc=\"$opentelemetry_context_traceparent\"";
  opentelemetry_operation_name my_example_backend_root;
  server {
    listen 80;
    server_name localhost;

    location / {
      opentelemetry on;
      opentelemetry_operation_name my_example_backend;
      opentelemetry_propagate;
      opentelemetry_trust_incoming_spans on;
      proxy_pass http://service-a:80/;

      auth_request /auth;
      set $dummy_val "$opentelemetry_context_traceparent";
    }

    location = /auth {
      opentelemetry_operation_name my_example_auth;
      opentelemetry on;
      opentelemetry_propagate;
      opentelemetry_trust_incoming_spans on;
      internal;
      proxy_pass http://service-c:80/auth;
      proxy_pass_request_body off;
      proxy_set_header Content-Length "";
      proxy_set_header X-Original-URI $request_uri;
    }
  }
}

image

It should be possible to achieve the same config using annotations I think, I haven't tested though.

esigo avatar May 20 '23 19:05 esigo

@esigo from your example, I would expect my_example_auth operation to be one span added to the trace. Is my expectation correct?

eguzki avatar May 24 '23 18:05 eguzki

@esigo from your example, I would expect my_example_auth operation to be one span added to the trace. Is my expectation correct?

@eguzki yes, it's service-c in the snapshot I attached. But this span is actually coming from the service not from nginx. And auth span (service-c) is a child to the span that nginx created.

esigo avatar May 24 '23 18:05 esigo

I confirmed that the example posted earlier by @esigo works, in particular adding set $dummy_val "$opentelemetry_context_traceparent"; in the parent location block where auth_request is used.

I was able to do it with a snippet, as well:

      nginx.ingress.kubernetes.io/configuration-snippet: |
        set $dummy_val "$opentelemetry_context_traceparent";

Thanks, @esigo!

deejgregor avatar Jun 11 '23 05:06 deejgregor

I'll note that there's a similar issue with propagating to gRPC backend services in #10319.

deejgregor avatar Sep 14 '23 16:09 deejgregor

I confirmed that this snipped actually works, but can someone explain why and is there any plan to fix it and make it work out of the box?

 nginx.ingress.kubernetes.io/configuration-snippet: |
        set $dummy_val "$opentelemetry_context_traceparent";

zmitry avatar Apr 04 '24 16:04 zmitry

Hi, do you have any updates on this?

zmitry avatar Apr 26 '24 10:04 zmitry

I asked Copilot to explain that workaround:

The set $dummy_val "$opentelemetry_context_traceparent"; line is a workaround to ensure that the traceparent value is correctly propagated. In Nginx, variables are evaluated lazily. This means that if a variable is not used, it won’t be evaluated. By assigning the traceparent value to a dummy variable, you’re forcing Nginx to evaluate it, ensuring that the OpenTelemetry context is correctly propagated.

Is that accurate? The workaround worked for us though, so thanks!

andrewdinunzio avatar Jun 26 '24 18:06 andrewdinunzio

Esigo and other comments seems to be conclusive on the topic of this issue. So there is no action item on the project for this issue hence closing it.

/clsoe

longwuyuan avatar Sep 11 '24 11:09 longwuyuan

/close

longwuyuan avatar Sep 11 '24 11:09 longwuyuan

@longwuyuan: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Sep 11 '24 11:09 k8s-ci-robot

I tried to use the workaround today with the following results:

  1. When using nginx.ingress.kubernetes.io/auth-keepalive: "20" annotation in an ingress, the auth request is not part of the full request trace.
  2. When not using the keepalive annotation, the workaround is effective and the auth request is part of the full trace.

Any ideas how to get it working when using keepalive?

Edit: it starts to work when nginx.ingress.kubernetes.io/auth-keepalive-share-vars: "true" is set.

juissi-t avatar Sep 19 '24 13:09 juissi-t