
opentracing: opentracing_propagate_context (when I try to turn on opentracing only for selected ingresses)

rurus9 opened this issue 3 years ago • 9 comments

NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v0.50.0
  Build:         34a35a24cfef17aa1392b7fb2280f323b253c6b2
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.9

-------------------------------------------------------------------------------

How was the ingress-nginx-controller installed: by Helm (chart version: ingress-nginx-3.40.0). Helm values:

controller:  
  replicaCount: 3
  
  resources:
    limits:
      cpu: 2
      memory: 4Gi
    requests:
      cpu: 150m
      memory: 1Gi

  service:
    annotations:
      metallb.universe.tf/address-pool: POOL
    loadBalancerIP: "EDITED"
    externalTrafficPolicy: "Local"
  priorityClassName: "800"
dhParam: "EDITED"

Ingress controller is up and running:

NAME                                       READY   STATUS    RESTARTS   AGE
ingress-nginx-controller-59565fcdc-hsdxm   1/1     Running   0          2d18h
ingress-nginx-controller-59565fcdc-kjzwh   1/1     Running   0          2d18h
ingress-nginx-controller-59565fcdc-r7b66   1/1     Running   0          2d18h

What happened: When I try to enable opentracing only for selected ingresses, as documented (https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/), by adding these annotations to the Ingress object:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
    jaeger-collector-host: jaegertracing-agent.jaegertracing.svc.cluster.local
    (...)

I am getting an error:

$ kubectl apply -f ingress-test-opentracing.yaml
Error from server (BadRequest): error when applying patch:
(...)
Resource: "extensions/v1beta1, Resource=ingresses", GroupVersionKind: "extensions/v1beta1, Kind=Ingress"
Name: "test-opentracing", Namespace: "test-opentracing"
for: "ingress-test-opentracing.yaml": admission webhook "validate.nginx.ingress.kubernetes.io" denied the request: 
-------------------------------------------------------------------------------
Error: exit status 1
2021/12/31 07:15:24 [warn] 545#545: the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx-cfg27920523:145
nginx: [warn] the "http2_max_field_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx-cfg27920523:145
2021/12/31 07:15:24 [warn] 545#545: the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx-cfg27920523:146
nginx: [warn] the "http2_max_header_size" directive is obsolete, use the "large_client_header_buffers" directive instead in /tmp/nginx-cfg27920523:146
2021/12/31 07:15:24 [warn] 545#545: the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx-cfg27920523:147
nginx: [warn] the "http2_max_requests" directive is obsolete, use the "keepalive_requests" directive instead in /tmp/nginx-cfg27920523:147
2021/12/31 07:15:24 [error] 545#545: opentracing_propagate_context before tracer loaded
nginx: [error] opentracing_propagate_context before tracer loaded
nginx: configuration file /tmp/nginx-cfg27920523 test failed

If we omit the "directive is obsolete" warnings, what remains is:

2021/12/31 07:15:24 [error] 545#545: opentracing_propagate_context before tracer loaded
nginx: [error] opentracing_propagate_context before tracer loaded
nginx: configuration file /tmp/nginx-cfg27920523 test failed

What you expected to happen: According to the documentation (https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/) and this PR: https://github.com/kubernetes/ingress-nginx/pull/4983, it should be possible to enable opentracing only for selected ingresses:

What this PR does / why we need it:

Opentracing can be configured:

  • globally, using "enable-opentracing": "true"
  • globally, using "enable-opentracing": "true", but disabled in an ingress with nginx.ingress.kubernetes.io/enable-opentracing: "false"
  • globally disabled, using "enable-opentracing": "false" or with no setting, but enabled in an ingress with nginx.ingress.kubernetes.io/enable-opentracing: "true"

Any of these options also require the conditional loading of the opentracing module and the specific tracer

However, the third option (no global opentracing configuration, only annotations set on selected ingresses) does not seem to work. I found reports of this error (opentracing_propagate_context) even when opentracing was globally configured (by adding "enable-opentracing": "true" to the config), so the problem with opentracing seems to be wider: https://github.com/kubernetes/ingress-nginx/issues/7970 https://github.com/kubernetes/ingress-nginx/issues/6103#issuecomment-698930859

How to reproduce it: Create an Ingress with these annotations:

metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
    jaeger-collector-host: JAEGERTRACING-AGENT.ADDRESS

rurus9 avatar Dec 31 '21 07:12 rurus9

@rurus9: This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Dec 31 '21 07:12 k8s-ci-robot

I think I know what's going on. I - and others - tried to enable opentracing by configuring the collector with annotations. I believe that is impossible. The correct path is:

  1. Configure the collector globally in the config (zipkin-collector-host, jaeger-collector-host or datadog-collector-host) - see the sketch below
  2. Activate opentracing either globally in the config (enable-opentracing: "true") or with an annotation on a single Ingress:
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
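
A minimal sketch of what I mean by step 1, assuming the controller's ConfigMap is named ingress-nginx-controller and reusing the agent address from my original report (tracing itself would then only be enabled per Ingress as in step 2):

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
data:
  # collector configured globally; enable-opentracing deliberately not set here
  jaeger-collector-host: jaegertracing-agent.jaegertracing.svc.cluster.local
...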

Am I right?

rurus9 avatar Jan 03 '22 07:01 rurus9


Yes, it seems it's only possible to enable it globally. You can then add the enable-opentracing annotation, set to "false", to each Ingress definition that you do NOT want to trace:

kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "false"

That is as far as I understood and tested it. I finally have a working ingress-nginx here with opentracing enabled, in combination with a jaeger-agent that ingests the traces into an Elastic APM server.

If you add this annotation with "true" you'll again see errors like "opentracing_propagate_context before tracer loaded" in the ingress logs after applying. It also seems that most of the nginx.ingress.kubernetes.io/... annotations do not override the globally set ones, like:

    nginx.ingress.kubernetes.io/jaeger-service-name: "python-flask-test"
    nginx.ingress.kubernetes.io/opentracing-operation-name: "python-flask-test"

or am I missing something?

BBQigniter avatar Feb 15 '22 07:02 BBQigniter

You have a misconfiguration. My versions:

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       v0.45.0
  Build:         7365e9eeb2f4961ef94e4ce5eb2b6e1bdb55ce5c
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.19.6
-------------------------------------------------------------------------------

My configs: I'm not using the all-in-one solution, so I use a collector service directly.

Globally:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
data:
  jaeger-endpoint: "http://jaeger-collector.jaeger.svc.cluster.local:14268/api/traces"
  opentracing-trust-incoming-span: "true"
  jaeger-service-name: ingress-nginx
  opentracing-operation-name: "$request_method $host"
  opentracing-location-operation-name: "$namespace/$service_name"
...

Enable tracing for the required Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"
    # Enable tracing
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
...   

If you want to specify some parameters for a particular Ingress, just override the global value by adding an annotation:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/opentracing-trust-incoming-span: "false"
    nginx.ingress.kubernetes.io/jaeger-endpoint: "http://another-jaeger-collector.mynamespace.svc.cluster.local:14268/api/traces"
...   

vosmax avatar Feb 25 '22 18:02 vosmax

@vosmax hmm, I tried it now similarly to your config, but without globally enabled opentracing I can't get it to work. As soon as I enable it in the workload's Ingress config, I again get nginx: [error] opentracing_propagate_context before tracer loaded

Maybe this has something to do with the jaeger-endpoint setting you are using. I have to use the jaeger-agent with jaeger-collector-host.
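
For reference, the two (alternative) ConfigMap keys as I understand them: jaeger-collector-host points at a jaeger-agent (UDP, usually port 6831), while jaeger-endpoint posts spans straight to the collector's HTTP API. The hostnames below are just the examples already used in this thread:

data:
  # report via a jaeger-agent (UDP)
  jaeger-collector-host: jaeger-agent.apm-agent.svc.cluster.local
  # ...or report directly to the collector over HTTP
  # jaeger-endpoint: "http://jaeger-collector.jaeger.svc.cluster.local:14268/api/traces"
...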

But many thanks for the opentracing-operation-name and the other settings :) - now I see what those are doing and I really like that.

Here is the ConfigMap I'm using for the ingress controller (btw v1.1.1):

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    helm.sh/chart: ingress-nginx-4.0.15
    app.kubernetes.io/name: int-ingress-nginx
    app.kubernetes.io/instance: int-ingress-nginx
    app.kubernetes.io/version: 1.1.1
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: int-ingress-nginx-controller
  namespace: int-ingress-nginx
data:
  # have to enable it globally - else we see "opentracing_propagate_context before tracer loaded" if enabled via annotation in the workload's ingress config
  enable-opentracing: "true"
  jaeger-service-name: "at-lab-int-ingress-nginx"
  # pointing to jaeger-agent running on cluster
  jaeger-collector-host: "jaeger-agent.apm-agent.svc.cluster.local"
  # this should make jaeger's trace-format compatible with Elasticsearch APM feature
  jaeger-propagation-format: "w3c"
  # not sure if this has any effect
  jaeger-trace-context-header-name: "trace-id"
  # awesome - now clearer view in Kibana's APM app
  opentracing-trust-incoming-span: "true"
  opentracing-operation-name: "$request_method $host"
  opentracing-location-operation-name: "$namespace/$service_name"

BBQigniter avatar Mar 01 '22 11:03 BBQigniter

@BBQigniter glad to know that I could partially help.

Based on the nginx.tmpl file everything is in the right place. I would suggest:

  1. Pull out the built nginx.conf and check that everything is formatted and filled in well
  2. Try to use jaeger-endpoint instead of jaeger-collector-host to prove or disprove your belief that it could be the cause of the issue

vosmax avatar Mar 02 '22 14:03 vosmax

Any progress on this issue? We have been facing a similar issue since upgrading the ingress controller Helm chart a while ago. Funnily enough, existing Ingress configurations still work; we only face this problem when trying to add new Ingresses. For us, it seems to be a bug in the validation logic of the admission webhook, since we have Ingress configurations (populated into the nginx.conf of the controller pod) that still work and do what they are supposed to do. Only new Ingresses, or changed Ingresses with the same config, cause problems.

This is really annoying at the moment because we cannot use traces from nginx anymore. We have activated opentracing globally on the controller, and we need to configure the "opentracing_tag" directive on the ingress rule itself to get a correct tag value in Datadog. Since we cannot set this directive globally, we have not yet found a way to get this working again.
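
To illustrate what I mean by a per-ingress tag, a rough sketch using the generic configuration-snippet annotation (the tag key and value are just placeholders, and this assumes the opentracing module and tracer are already loaded globally):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
    # inject an opentracing_tag directive into the generated location block
    nginx.ingress.kubernetes.io/configuration-snippet: |
      opentracing_tag "custom.tag" "placeholder-value";
...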

UPDATE / FIX:

After debugging for quite a while, we found the actual problem: we have 3 nginx ingress controllers deployed. The admission webhooks are NOT restricted to the ingress class of their corresponding controller and are active for all Ingress resources. That is why they fail: opentracing was, in our case, only enabled on a single ingress controller, not on all 3. So the admission webhook is working fine, it is just not scoped correctly. We have now added labels to our Ingresses and use an ObjectMatcher on the admission webhook so that it only applies to those Ingresses it is responsible for.

Besides that, the documentation https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/ is correct: opentracing has to be enabled globally as documented there and then switched off in Ingresses where tracing is not wanted. If nothing is specified, it is then enabled by default on each Ingress.

This page, https://kubernetes.github.io/ingress-nginx/user-guide/multiple-ingress/, should have an additional section describing potential issues with multiple admission webhook controllers and how to circumvent them. Ideally, the admission webhook would only listen to Ingresses having the ingress class defined for the controller. That way, the issue we had would be prevented.
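
For reference, if the ObjectMatcher mentioned above maps to the plain Kubernetes objectSelector field, a rough sketch of the scoped webhook could look like this (the label key/value and the admission service name/namespace are our own naming assumptions, not ingress-nginx defaults):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: ingress-nginx-admission
webhooks:
  - name: validate.nginx.ingress.kubernetes.io
    # only validate Ingresses that carry "our" controller's label
    objectSelector:
      matchLabels:
        ingress-controller: internal
    rules:
      - apiGroups: ["networking.k8s.io"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["ingresses"]
    clientConfig:
      service:
        name: ingress-nginx-controller-admission
        namespace: ingress-nginx
        path: /networking/v1/ingresses
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail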

larsduelfer avatar Mar 30 '22 08:03 larsduelfer

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Jun 28 '22 13:06 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Jul 28 '22 14:07 k8s-triage-robot

@BBQigniter After a while I've faced this question again :) Here is actually the proper way to configure a jaeger-agent daemonset for the nginx controller. Unfortunately, the documentation doesn't show it properly and transparently; I had to read the controller code to find it out.

To use the node agent you need to get the IP of the node:

extraEnvs:
  - name: NODE_IP
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP

Now you can use this env variable in the config:

config:
  # Has to be disabled if you plan to enable it only in particular Ingresses (not globally)
  # enable-opentracing: "true"
  jaeger-collector-host: ${NODE_IP}

Now you have the node IP in the config:

kubectl exec ingress-nginx-controller-557c5c6b8b-lpwgt -- cat /etc/nginx/opentracing.json
{
  "service_name": "ingress-nginx",
  "propagation_format": "jaeger",
  "sampler": {
	"type": "ratelimiting",
	"param": 5,
	"samplingServerURL": "http://127.0.0.1:5778/sampling"
  },
  "reporter": {
	"endpoint": "",
	"localAgentHostPort": "172.30.85.174:6831"
  },
  "headers": {
	"TraceContextHeaderName": "",
	"jaegerDebugHeader": "",
	"jaegerBaggageHeader": "",
	"traceBaggageHeaderPrefix": ""
  }
}

vosmax avatar Aug 22 '22 14:08 vosmax

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-triage-robot avatar Sep 21 '22 15:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 21 '22 15:09 k8s-ci-robot