nginx-opentracing
Support off-cluster jaeger collector
Hello,
I was trying to enable OpenTracing/Jaeger support for an nginx ingress controller. The collector endpoint is not on the same cluster.
The docs only say it must be a valid URL, not necessarily a URL on the cluster: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#jaeger-collector-host
The URL is HTTPS-terminated and reachable from inside the pod.
But I just got:

```
Error: exit status 1
2020/01/21 10:51:02 [error] 2228#2228: opentracing_propagate_context before tracer loaded
nginx: [error] opentracing_propagate_context before tracer loaded
nginx: configuration file /tmp/nginx-cfg185864836 test failed
```
Is this scenario supposed to be supported? We have several Spring Boot applications that successfully use the collector endpoint.
Thanks, Kind regards, Kris
We see the same error message after trying to enable OpenTracing in combination with jaeger-agents.
I used this manual: https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/
So I created a DaemonSet for the jaeger-agent, which looks like:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: jaeger-agent
  namespace: apm
  labels:
    app: jaeger-agent
spec:
  selector:
    matchLabels:
      app: jaeger-agent
  template:
    metadata:
      labels:
        app: jaeger-agent
    spec:
      # added temporarily for testing purposes as it seems that the KEMP balancer
      # with the currently installed firmware version doesn't support HTTP/2 well
      hostAliases:
        - ip: "10.12.100.11"
          hostnames:
            - "test-apm-server-grpc.somecorp.services"
      containers:
        - image: example-repo/jaegertracing/jaeger-agent:1.24
          name: jaeger-agent
          ports:
            # port names must not be longer than 15 chars
            #- name: jaeger-agent-thrift-compact-port
            - name: jatc-port
              containerPort: 6831
              protocol: UDP
            #- name: jaeger-agent-thrift-binary-port
            - name: jatb-port
              containerPort: 6832
              protocol: UDP
            #- name: jaeger-agent-serve-config-port
            - name: jasc-port
              containerPort: 5778
              protocol: TCP
            #- name: jaeger-agent-zipkin-thrift-compact-port
            - name: jaztc-port
              containerPort: 5775
              protocol: UDP
            #- name: jaeger-agent-admin-port
            - name: jaa-port
              containerPort: 14271
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: 14271
            initialDelaySeconds: 60
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 14271
            initialDelaySeconds: 60
            periodSeconds: 5
          # as we use the "host.alias" workaround for the moment we need to skip cert-verification
          args:
            - "--reporter.grpc.host-port=test-apm-server-grpc.somecorp.services:30443"
            - "--reporter.grpc.tls.enabled=true"
            - "--agent.tags='elastic-apm-auth=Bearer supersecurekey'"
            - "--reporter.grpc.tls.skip-host-verify=true"
          resources:
            requests:
              memory: 1Gi
              cpu: 0.1
            limits:
              memory: 1Gi
              # so it cannot take all CPUs if something is wrong
              cpu: 1
---
apiVersion: v1
kind: Service
metadata:
  name: jaeger-agent
  namespace: apm
  labels:
    app: jaeger-agent
spec:
  selector:
    app: jaeger-agent
  ports:
    - name: jatc-port
      port: 6831
      targetPort: 6831
      protocol: UDP
    - name: jatb-port
      port: 6832
      targetPort: 6832
      protocol: UDP
    - name: jasc-port
      port: 5778
      targetPort: 5778
      protocol: TCP
    - name: jaztc-port
      port: 5775
      targetPort: 5775
      protocol: UDP
    - name: jaa-port
      port: 14271
      targetPort: 14271
      protocol: TCP
```
The agent seems to be running fine in its own namespace; the logs look OK.
Then we have a small test image which should create some traces. The YAML is:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: jurisdictions
  name: jurisdictions
  namespace: dev-team
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jurisdictions
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: jurisdictions
    spec:
      containers:
        - env:
            - name: ASPNETCORE_ENVIRONMENT
              value: Development
          image: example.repo/bold/test/jurisdiction:testapm
          imagePullPolicy: Always
          name: jurisdictions
          ports:
            - containerPort: 80
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 500m
              memory: 256Mi
      imagePullSecrets:
        - name: example
      restartPolicy: Always
      volumes:
        - configMap:
            defaultMode: 256
            name: global.logging
            optional: false
          name: logging
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: jurisdictions
  name: jurisdictions
  namespace: dev-team
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: jurisdictions
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentracing: "true"
    jaeger-collector-host: "jaeger-agent.apm.svc.cluster.local"
    jaeger-collector-port: "6831"
    # also tried with this - nope
    #jaeger-sampler-host: "http://jaeger-agent.apm.svc.cluster.local"
    #jaeger-sampler-port: "5778"
    jaeger-service-name: "jurisdictions-nginx"
  labels:
    app: jurisdictions
  name: jurisdictions
  namespace: dev-team
spec:
  ingressClassName: "int-ingress-nginx"
  rules:
    - host: jurisdictions-testapm.lab
      http:
        paths:
          - backend:
              serviceName: jurisdictions
              servicePort: 80
            path: /
```
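Side note: the ConfigMap documentation linked above describes `jaeger-collector-host`, `jaeger-collector-port`, and `jaeger-service-name` as keys in the controller's ConfigMap, not as Ingress annotations, so maybe something like the following is what the controller actually reads. The ConfigMap name and namespace here are just guesses for a default install; a sketch, not a verified config:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # name/namespace are assumptions; match them to your ingress-nginx installation
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  enable-opentracing: "true"
  jaeger-collector-host: "jaeger-agent.apm.svc.cluster.local"
  jaeger-collector-port: "6831"
  jaeger-service-name: "jurisdictions-nginx"
```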
I tried this on two different Kubernetes clusters, one with nginx-ingress 0.35 and another with 0.41; both end up with the error message
`nginx: [error] opentracing_propagate_context before tracer loaded`
From inside the nginx-ingress containers, the `jaeger-agent.apm.svc.cluster.local` ports are reachable as far as I have seen.
The error message means that the resulting NGINX config contains:

```
opentracing_propagate_context;
...
opentracing_load_tracer ...;
```

in that order. So the order is wrong: the tracer must be loaded before `opentracing_propagate_context` can be used.
An alternative is patching the ingress-nginx config template /etc/nginx/template/nginx.tmpl directly.
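Whichever way it is fixed, the rendered config has to end up with the directives in roughly this shape. A sketch, assuming the plugin and config paths used by the Jaeger tracer in the ingress-nginx image:

```nginx
http {
    # the tracer must be loaded at http level, before any propagate directive
    opentracing_load_tracer /usr/local/lib/libjaegertracing_plugin.so /etc/nginx/opentracing.json;
    opentracing on;

    server {
        location / {
            # valid only once the tracer has been loaded above
            opentracing_propagate_context;
        }
    }
}
```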
Hmm, I see.
I'll have a detailed look ASAP. For me it feels like https://kubernetes.github.io/ingress-nginx/user-guide/third-party-addons/opentracing/ is incomplete: I think it is missing info about opentracing.json and how/where opentracing_load_tracer is supposed to be configured. I assumed those things were somehow configured by the annotations that you have to set.
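In case it helps anyone searching later: the /etc/nginx/opentracing.json that opentracing_load_tracer points at is the jaeger-client-cpp tracer configuration. A minimal sketch, with the service name and agent address below as placeholders:

```json
{
  "service_name": "jurisdictions-nginx",
  "sampler": { "type": "const", "param": 1 },
  "reporter": {
    "localAgentHostPort": "jaeger-agent.apm.svc.cluster.local:6831"
  }
}
```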
I have now tried several things with the nginx.tmpl and am pulling my hair out, because it's just super strange. As noted in one of my posts before, we currently use nginx-ingress 0.41.
As far as I have now dug through the nginx.tmpl, I think I have a hint as to why the opentracing_load_tracer line is not added.
Looking at line 292 in the nginx.tmpl (https://github.com/kubernetes/ingress-nginx/blob/controller-v0.41.0/rootfs/etc/nginx/template/nginx.tmpl#L292), I fear that this block is only rendered if you set the enable-opentracing parameter globally (BTW, I tried that too and got the same error message). But we want to avoid the global option, because we only need opentracing for a few workloads.
Further down in the file, at line 1065 (https://github.com/kubernetes/ingress-nginx/blob/controller-v0.41.0/rootfs/etc/nginx/template/nginx.tmpl#L1065), you can find `{{ buildOpentracingForLocation $all.Cfg.EnableOpentracing $location }}` - but as it's in the location section, opentracing_load_tracer would not help there, and it is not added by this build function anyway (see https://github.com/kubernetes/ingress-nginx/blob/8aefb97fea4ebc429bb59921c82fdee0ab4d2a18/internal/ingress/controller/template/template.go#L1301). So I copied the if statement from lines 948-951, which looks like:
```
{{ if $all.Cfg.EnableOpentracing }}
opentracing on;
opentracing_propagate_context;
{{ end }}
```
and pasted it below line 943, so that the opentracing_load_tracer should land in the appropriate server section:
```
{{ if $all.Cfg.EnableOpentracing }}
opentracing_load_tracer /usr/local/lib/libjaegertracing_plugin.so /etc/nginx/opentracing.json;
{{ end }}
```
But somehow this is not working either. Hardcoding it into that section would also be suboptimal, because then every server section would carry the directive, if I understand that correctly. IMHO something is wrong with the default nginx.tmpl, or with how the annotations are consumed by the Go code that renders nginx.conf from nginx.tmpl.