jaeger-operator
[Bug]: Adding query.http.tls.enabled=true breaks jaeger-query
What happened?
I want to set up Jaeger with TLS enabled so that Grafana can connect securely to Jaeger as a data source. When I add the option query.http.tls.enabled=true to my deployment YAML, the jaeger-query pod does not start. The collector and agent do start with the equivalent option set to true.
Setting query.http.tls.enabled=false works as expected. The option should be supported (https://www.jaegertracing.io/docs/1.36/cli/). I'm guessing it's a bug in Jaeger. However, if I did something wrong, some help or examples would be appreciated (I haven't found any documentation or examples on how to use Jaeger with TLS).
Changing the strategy to "all-in-one" does allow the pods to start up, but there is still no TLS.
Steps to reproduce
Deploy Jaeger operator on K8s cluster. Deploy Jaeger (see Deployment configs).
Expected behavior
The jaeger-query pod starts with TLS enabled on its HTTP server.
Relevant log output
{"level":"info","ts":1666868672.982716,"caller":"app/server.go:273","msg":"Starting HTTP server","port":16686,"addr":":16686"}
{"level":"error","ts":1666868672.9827507,"caller":"app/server.go:284","msg":"Could not start HTTP server","error":"open : no such file or directory","stacktrace":"github.com/jaegertracing/jaeger/cmd/query/app.(*Server).Start.func1\n\tgithub.com/jaegertracing/jaeger/cmd/query/app/server.go:284"}
Screenshot
No response
Additional context
No response
Jaeger backend version
v1.36.0
SDK
No response
Pipeline
No response
Storage backend
ElasticSearch 7
Operating system
No response
Deployment model
Kubernetes
Deployment configs
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata: 
  name: jaeger
  namespace: observability
spec:
  agent:
    strategy: Sidecar
    options:
      admin.http.tls.enabled: true
  query:
    options:
      query.http.tls.enabled: true
  collector:
    options:
      collector.http.tls.enabled: true
    maxReplicas: 5
    resources:
      limits:
        cpu: 100m
        memory: 128Mi
  storage: 
    options: 
      es: 
        server-urls: https://elastic-es-http.elastic:9200
        tls:
          ca: /es/certificates/ca.crt
          enabled: true
          skip-host-verify: false
    secretName: jaeger-secret
    type: elasticsearch
  volumeMounts:
    - mountPath: "/es/certificates/"
      name: certificates
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: elastic-es-http-certs-public
        items:
          - key: ca.crt
            path: ca.crt
  strategy: production
Same issue here, very strange.
Is the deployment created? If yes, could you share the event log?
Yes, the deployment is created. kubectl events gives:
LAST SEEN   TYPE      REASON              OBJECT                                   MESSAGE
52s         Normal    Scheduled           pod/jaeger-collector-7b65ff566f-n5mn8    Successfully assigned observability/jaeger-collector-7b65ff566f-n5mn8 to ip-xxxxxxxxxxx.eu-central-1.compute.internal
50s         Normal    Pulling             pod/jaeger-collector-7b65ff566f-n5mn8    Pulling image "jaegertracing/jaeger-collector:1.35.2"
45s         Normal    Pulled              pod/jaeger-collector-7b65ff566f-n5mn8    Successfully pulled image "jaegertracing/jaeger-collector:1.35.2" in 4.268128423s
23s         Normal    Created             pod/jaeger-collector-7b65ff566f-n5mn8    Created container jaeger-collector
23s         Normal    Started             pod/jaeger-collector-7b65ff566f-n5mn8    Started container jaeger-collector
23s         Normal    Pulled              pod/jaeger-collector-7b65ff566f-n5mn8    Container image "jaegertracing/jaeger-collector:1.35.2" already present on machine
7s          Warning   BackOff             pod/jaeger-collector-7b65ff566f-n5mn8    Back-off restarting failed container
22s         Warning   Unhealthy           pod/jaeger-collector-7b65ff566f-n5mn8    Readiness probe failed: Get "http://[xxxxxxx]:14269/": dial tcp [xxxxxxx]:14269: connect: connection refused
52s         Normal    SuccessfulCreate    replicaset/jaeger-collector-7b65ff566f   Created pod: jaeger-collector-7b65ff566f-n5mn8
53s         Normal    ScalingReplicaSet   deployment/jaeger-collector              Scaled up replica set jaeger-collector-7b65ff566f to 1
52s         Normal    Scheduled           pod/jaeger-query-57f9948bcc-gzsc4        Successfully assigned observability/jaeger-query-57f9948bcc-gzsc4 to ip-xxxxxxxx.eu-central-1.compute.internal
50s         Normal    Pulling             pod/jaeger-query-57f9948bcc-gzsc4        Pulling image "jaegertracing/jaeger-query:1.35.2"
46s         Normal    Pulled              pod/jaeger-query-57f9948bcc-gzsc4        Successfully pulled image "jaegertracing/jaeger-query:1.35.2" in 4.45897922s
17s         Normal    Created             pod/jaeger-query-57f9948bcc-gzsc4        Created container jaeger-query
17s         Normal    Started             pod/jaeger-query-57f9948bcc-gzsc4        Started container jaeger-query
45s         Normal    Pulling             pod/jaeger-query-57f9948bcc-gzsc4        Pulling image "jaegertracing/jaeger-agent:1.35.2"
42s         Normal    Pulled              pod/jaeger-query-57f9948bcc-gzsc4        Successfully pulled image "jaegertracing/jaeger-agent:1.35.2" in 3.114319514s
42s         Normal    Created             pod/jaeger-query-57f9948bcc-gzsc4        Created container jaeger-agent
42s         Normal    Started             pod/jaeger-query-57f9948bcc-gzsc4        Started container jaeger-agent
17s         Normal    Pulled              pod/jaeger-query-57f9948bcc-gzsc4        Container image "jaegertracing/jaeger-query:1.35.2" already present on machine
7s          Warning   BackOff             pod/jaeger-query-57f9948bcc-gzsc4        Back-off restarting failed container
52s         Normal    SuccessfulCreate    replicaset/jaeger-query-57f9948bcc       Created pod: jaeger-query-57f9948bcc-gzsc4
53s         Normal    ScalingReplicaSet   deployment/jaeger-query                  Scaled up replica set jaeger-query-57f9948bcc to 1
Oh, it seems like the pod is crashing. Could you also provide the logs from your pod?
I've shared the relevant log output in the "Relevant log output" section of the original post. After that part the pod shuts down. Let me know if you need more logs than that and I'll dig them up.
Ah, I see. It says "Could not start HTTP server","error":"open : no such file or directory". Looks like you need to define the paths to the certs too, right?
https://github.com/jaegertracing/jaeger/blob/v1.36.0/pkg/config/tlscfg/flags.go#L60-L69
I did include a clientCA, and the error persisted. Are all the options listed at that URL required (e.g. tlsCert, tlsKey, tlsClientCA)?
I haven't had time to test setting all of the options yet, but I will report back here when I do.
What confuses me is that when I create the deployment with the all-in-one strategy, no error is shown when I set query.http.tls.enabled: true, and the pod and HTTP server start without issue.
Adding tls.cert and tls.key worked, thanks. However, I am now running into https://github.com/jaegertracing/jaeger/issues/2976, even with a clean Jaeger installation and even after rolling back to plain HTTP.
As for the TLS part, a more descriptive error message than "no such file or directory" would be useful.
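For anyone landing here with the same error, the fix reported above (setting the certificate and key paths alongside query.http.tls.enabled) might look roughly like the sketch below. It is not a verified configuration. The Secret name jaeger-query-tls, the volume name query-tls, and the mount path /tls are hypothetical, and the Secret is assumed to be of type kubernetes.io/tls, which stores its data under the keys tls.crt and tls.key. The storage section from the original config is omitted for brevity.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: jaeger
  namespace: observability
spec:
  strategy: production
  query:
    options:
      query.http.tls.enabled: true
      # In this thread, the "open : no such file or directory" error went away
      # once both of these were set to files that exist inside the container.
      query.http.tls.cert: /tls/tls.crt   # hypothetical path
      query.http.tls.key: /tls/tls.key    # hypothetical path
  volumeMounts:
    - name: query-tls                     # hypothetical volume name
      mountPath: /tls                     # hypothetical mount path
      readOnly: true
  volumes:
    - name: query-tls
      secret:
        secretName: jaeger-query-tls      # hypothetical Secret (type kubernetes.io/tls)
```

The volumes and volumeMounts are placed at the spec level, mirroring how the Elasticsearch CA certificate is mounted in the original config. A client such as Grafana would then need to trust the CA that issued the certificate.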