
Sidecar not set up

Open edenkoveshi opened this issue 3 years ago • 13 comments

Hi, I am just starting to work with this operator and OpenTelemetry. I am trying to set up one collector as a sidecar and another one as a Deployment. The Deployment is set up, but the sidecar isn't. The operator does not provide any useful logs (not even for the Deployment, actually). Here are my YAMLs:

OpenTelemetry Operator in the operators namespace:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    control-plane: controller-manager
  name: otel-operator
  namespace: operators
spec:
  replicas: 1
  selector:
    matchLabels:
      app: otel-operator
  template:
    metadata:
      labels:
        app: otel-operator
    spec:
      containers:
      - env:
        - name: ENABLE_WEBHOOKS
          value: "false"
        image: ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator:v0.41.1
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
        name: manager
        ports:
        - containerPort: 8080
          name: metrics
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 64Mi
      serviceAccountName: otel-operator
      terminationGracePeriodSeconds: 10

OpenTelemetry Collectors and the application in the tracing-1 namespace:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otelcol-deployment
spec:
  mode: deployment #Works
  image: otel/opentelemetry-collector:0.41.0
  config: |
    receivers:
      jaeger:
        protocols:
          grpc:
      otlp:
        protocols:
          grpc:
          http:
    processors:

    exporters:
      jaeger:
        endpoint: "my-jaeger-collector-headless:14250"

    service:
      pipelines:
        traces:
          receivers: [otlp, jaeger]
          processors: []
          exporters: [jaeger]
---
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otelcol-sidecar
spec:
  mode: sidecar #Doesn't work
  image: otel/opentelemetry-collector:0.41.0
  config: |
    receivers:
      jaeger:
        protocols:
          grpc:
      otlp:
        protocols:
          grpc:
          http:
    processors:

    exporters:
      otlp:
        endpoint: "http://otelcol-deployment:4317"


    service:
      pipelines:
        traces:
          receivers: [otlp, jaeger]
          processors: []
          exporters: [otlp]

And the sample application Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
      annotations:
        "sidecar.opentelemetry.io/inject": "otelcol-sidecar" # "true" does not work either
    spec:
      containers:
      - name: myapp
        image: jaegertracing/vertx-create-span:operator-e2e-tests
        ports:
        - containerPort: 8080
          protocol: TCP
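
To check whether the sidecar was injected, I list the containers of the application pod; only myapp shows up, while a pod with the sidecar injected should also contain an otc-container next to it:

kubectl get pods -l app=myapp -o jsonpath='{.items[*].spec.containers[*].name}'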

I really don't understand why this doesn't work. Thanks in advance.

edenkoveshi avatar Jan 13 '22 16:01 edenkoveshi

Never mind. I now realize the sidecar is injected within the existing container, and not as a separate sidecar container as in the Jaeger operator. This is quite confusing.

edenkoveshi avatar Jan 13 '22 16:01 edenkoveshi

Wait, what? The sidecar is a second container on the same pod. Is this not what you are seeing?

Here's an e2e test that can be used for reference:

https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/smoke-sidecar/01-install-app.yaml

And this is the assertion (note that only a few nodes are asserted; the actual YAML is obviously bigger and more complete):

https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/smoke-sidecar/01-assert.yaml
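
In short, the assertion expects the application pod to end up with a second, injected container next to your own, roughly like this (only a sketch; the exact image and settings depend on the operator and collector versions you run):

apiVersion: v1
kind: Pod
metadata:
  annotations:
    sidecar.opentelemetry.io/inject: "otelcol-sidecar"
  labels:
    app: myapp
spec:
  containers:
  - name: myapp
    image: jaegertracing/vertx-create-span:operator-e2e-tests
  - name: otc-container          # injected by the operator's admission webhook
    image: otel/opentelemetry-collector:0.41.0
    args:
    - --config=/conf/collector.yaml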

jpkrohling avatar Jan 13 '22 17:01 jpkrohling

Sorry, I was a little confused. This indeed does not set up a sidecar container. The test case you provided does not work either. These are the operator logs:

{"level":"info","ts":1642318385.1218407,"msg":"Starting the OpenTelemetry Operator","opentelemetry-operator":"0.41.1","opentelemetry-collector":"otel/opentelemetry-collector:0.41.0","opentelemetry-targetallocator":"quay.io/opentelemetry/target-allocator:0.1.0","auto-instrumentation-java":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:1.7.2","auto-instrumentation-nodejs":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.26.0","auto-instrumentation-python":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.26b1","build-date":"2021-12-22T17:12:32Z","go-version":"go1.17.5","go-arch":"amd64","go-os":"linux"}
{"level":"info","ts":1642318385.1233158,"logger":"setup","msg":"the env var WATCH_NAMESPACE isn't set, watching all namespaces"}
{"level":"info","ts":1642318386.1691027,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1642318386.1823914,"logger":"setup","msg":"starting manager"}
{"level":"info","ts":1642318386.187153,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1642318386.1931977,"logger":"collector-upgrade","msg":"looking for managed instances to upgrade"}
{"level":"info","ts":1642318386.1991522,"logger":"instrumentation-upgrade","msg":"looking for managed Instrumentation instances to upgrade"}
{"level":"info","ts":1642318386.2004423,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2105188,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2129228,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2132735,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.213576,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2139266,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2219572,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting EventSource","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","source":"kind source: /, Kind="}
{"level":"info","ts":1642318386.2220583,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting Controller","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector"}
{"level":"info","ts":1642318389.3165417,"logger":"collector-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":1642318391.4126122,"logger":"instrumentation-upgrade","msg":"no instances to upgrade"}
{"level":"info","ts":1642318391.4634194,"logger":"controller-runtime.manager.controller.opentelemetrycollector","msg":"Starting workers","reconciler group":"opentelemetry.io","reconciler kind":"OpenTelemetryCollector","worker count":1}

This is completely uninformative and I have no idea what's going on. Any help?

edenkoveshi avatar Jan 16 '22 08:01 edenkoveshi

I tried playing with this a little bit more. The operator does not seem to detect changes on the OpenTelemetryCollector CR at all. I have one running in deployment mode (it did create the Deployment), but when I change the configuration to include an additional exporter, for example, no changes are made; at least that's what I understand from the collector logs (I can't connect to the collector to debug it, as there is no /bin/sh in the image), but there don't seem to be any changes. When I delete and re-create the collector, it does pick up the new configuration. The operator logs are the same.

edenkoveshi avatar Jan 16 '22 13:01 edenkoveshi

Are you able to provide us with your concrete steps and commands to reproduce this using minikube? Given that the test runs on every PR, I'm confident the feature is working, but you might be experiencing this in a situation we are not testing.

jpkrohling avatar Jan 17 '22 12:01 jpkrohling

I simply applied all the YAMLs above with kubectl. The RBAC was taken from the rbac directory, and the ClusterRoleBinding binds to the correct ServiceAccount in the operators namespace.
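
For reference, the binding looks roughly like this (the binding and ClusterRole names below are assumptions; the ServiceAccount and namespace match the operator Deployment above):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: manager-role   # assumed name; use whatever the rbac manifests define
subjects:
- kind: ServiceAccount
  name: otel-operator
  namespace: operators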

edenkoveshi avatar Jan 18 '22 14:01 edenkoveshi

Any news on this? It is still not working.

edenkoveshi avatar Feb 23 '22 09:02 edenkoveshi

Same issue here. My steps:

  1. install cert-manager (kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml)
  2. install operator (kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml)
  3. deploy service (kubectl apply -f deployment.yaml)

I have used the deployment from the e2e test:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar-for-my-app
spec:
  mode: sidecar
  config: |
    receivers:
      jaeger:
        protocols:
          grpc:
    processors:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [jaeger]
          processors: []
          exporters: [logging]

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment-with-sidecar
spec:
  selector:
    matchLabels:
      app: my-pod-with-sidecar
  replicas: 1
  template:
    metadata:
      labels:
        app: my-pod-with-sidecar
      annotations:
        sidecar.opentelemetry.io/inject: "true"
    spec:
      containers:
      - name: myapp
        image: ealen/echo-server

(base) ➜  poc kubectl get pods --all-namespaces
NAMESPACE                       NAME                                                         READY   STATUS    RESTARTS         AGE
cert-manager                    cert-manager-6dd9658548-5kwd9                                1/1     Running   9 (5m54s ago)    8d
cert-manager                    cert-manager-cainjector-5987875fc7-dcnjf                     1/1     Running   16 (5m54s ago)   8d
cert-manager                    cert-manager-webhook-7b4c5f579b-x8l74                        1/1     Running   16 (5m ago)      8d
default                         my-deployment-with-sidecar-fd49d7558-rb494                   2/2     Running   0                3m37s
kube-system                     coredns-6d4b75cb6d-d9np5                                     1/1     Running   10 (5m54s ago)   19d
kube-system                     coredns-6d4b75cb6d-nqhl2                                     1/1     Running   10 (5m54s ago)   19d
kube-system                     etcd-docker-desktop                                          1/1     Running   10 (5m54s ago)   19d
kube-system                     kube-apiserver-docker-desktop                                1/1     Running   10 (5m54s ago)   19d
kube-system                     kube-controller-manager-docker-desktop                       1/1     Running   10 (5m54s ago)   19d
kube-system                     kube-proxy-wc9zm                                             1/1     Running   10 (5m54s ago)   19d
kube-system                     kube-scheduler-docker-desktop                                1/1     Running   10 (5m54s ago)   19d
kube-system                     storage-provisioner                                          1/1     Running   17 (5m3s ago)    19d
kube-system                     vpnkit-controller                                            1/1     Running   383 (5m ago)     19d
kubernetes-dashboard            dashboard-metrics-scraper-7bfdf779ff-crp8j                   1/1     Running   10 (5m54s ago)   19d
kubernetes-dashboard            kubernetes-dashboard-6cdd697d84-r7bnk                        1/1     Running   11 (5m54s ago)   19d
monitoring                      grafana-59b48c5bb5-j4j64                                     1/1     Running   10 (5m54s ago)   19d
opentelemetry-operator-system   opentelemetry-operator-controller-manager-79d8468dcf-zxkrf   2/2     Running   3 (5m ago)       26m
otel                            jaeger-all-in-one-ff896dfc7-7wtlq                            1/1     Running   10 (5m54s ago)   12d
otel                            poc-service-a-68b78d7b76-bfkn5                               1/1     Running   1 (5m54s ago)    56m
otel                            prometheus-6dcdf8dd45-7wwp9                                  1/1     Running   10 (5m54s ago)   12d
otel                            zipkin-all-in-one-9668844cb-5ztm5                            1/1     Running   10 (5m54s ago)   12d

keu avatar Jul 12 '22 14:07 keu

@keu, could you please confirm the Otel Operator version you have installed?

yuriolisa avatar Jul 15 '22 11:07 yuriolisa

@keu, could you please confirm the Otel Operator version you have installed?

@yuriolisa thank you for the quick response. How can I check it? I just ran:

kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
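
I guess one way to check it myself is to look at the image tags of the controller-manager Deployment that the manifest installs (assuming the default opentelemetry-operator-system namespace):

kubectl get deployment opentelemetry-operator-controller-manager -n opentelemetry-operator-system -o jsonpath='{.spec.template.spec.containers[*].image}'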

keu avatar Jul 17 '22 07:07 keu

Same here using 0.54.0. To be more accurate, if I manually kill the pod, the sidecar does get injected after re-creation...

The operator controller says:

error failed to select an OpenTelemetry Collector instance for this pod's sidecar
no OpenTelemetry Collector instances available
github.com/open-telemetry/opentelemetry-operator/internal/webhookhandler.(*podSidecarInjector).Handle
   /workspace/internal/webhookhandler/webhookhandler.go:92
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).Handle
   /go/pkg/mod/sigs.k8s.io/controller-runtime@…/pkg/webhook/admission/webhook.go:146
sigs.k8s.io/controller-runtime/pkg/webhook/admission.(*Webhook).ServeHTTP
   /go/pkg/mod/sigs.k8s.io/controller-runtime@…/pkg/webhook/admission/http.go:99
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1
   /go/pkg/mod/github.com/prometheus/client_golang@…/prometheus/promhttp/instrument_server.go:40
net/http.HandlerFunc.ServeHTTP
   /usr/local/go/src/net/http/server.go:2084
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1
   /go/pkg/mod/github.com/prometheus/client_golang@…/prometheus/promhttp/instrument_server.go:101
net/http.HandlerFunc.ServeHTTP
   /usr/local/go/src/net/http/server.go:2084
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2
   /go/pkg/mod/github.com/prometheus/client_golang@…/prometheus/promhttp/instrument_server.go:76
net/http.HandlerFunc.ServeHTTP
   /usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
   /usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
   /usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
   /usr/local/go/src/net/http/server.go:1966

jtama avatar Jul 29 '22 09:07 jtama

The operator does not seem to detect changes on the OpenTelemetryCollector CR at all.

@edenkoveshi from my understanding that's intended. There is a request for an auto-update feature: https://github.com/open-telemetry/opentelemetry-operator/issues/553


Same issue here.

@keu could you explain what issue you are facing? I created a new kind cluster and followed your instructions. I end up with default my-deployment-with-sidecar-fd49d7558-rb494 2/2 Running 0 3m37s. It contains:

Containers:
  myapp:
    Container ID:   containerd://a886e3f4498e70b31d8a3d99ea515864ac7ceec58fa70e4039e8d2e8886910ad
    Image:          ealen/echo-server
    Image ID:       docker.io/ealen/echo-server@sha256:bda7884be159b8b5a06a6eb9c7d066d4f9667bed1c68e4ab1564c5f48079b46e
  ...
  otc-container:
    Container ID:  containerd://e5e1747001f9c4b1fad931ceb2a17a3a0df2c9e0fd7285c63b5315ec61a53067
    Image:         ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector:0.56.0
    Image ID:      ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector@sha256:3fe065c828e718b464af35d79cf9017d0f4ca3c0e6f444c515d0ff36c020d41c
    Args:
      --config=/conf/collector.yaml
    ...

Then I changed my configuration with kubectl edit opentelemetrycollectors.opentelemetry.io sidecar-for-my-app.

Operator logs say nothing special:

{"level":"info","ts":1659088512.2854543,"logger":"opentelemetrycollector-resource","msg":"default","name":"sidecar-for-my-app"}
{"level":"info","ts":1659088512.287057,"logger":"opentelemetrycollector-resource","msg":"validate update","name":"sidecar-for-my-app"}

Next I applied the changes using kubectl rollout restart deployment my-deployment-with-sidecar.


Same here using 0.54.0. To be more accurate, if I manually kill the pod, the sidecar does get injected after re-creation...

@jtama could you confirm that everything was up and running? I tried to do the same using kubectl delete pod, but it didn't produce a panic.

frzifus avatar Jul 29 '22 10:07 frzifus

Actually, we found out that the annotated Deployment was created before the sidecar collector CR was, so the webhook had no collector instance to inject. We added a hint to help Helm order the Deployment after the collector, and now everything works fine.
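
For anyone hitting the same thing, one way to express such an ordering hint in Helm (only a sketch of the idea, not necessarily what our chart does) is to create the collector CR as a pre-install hook, so it is applied before the rest of the release, including the annotated Deployment:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar-for-my-app
  annotations:
    "helm.sh/hook": pre-install        # create this CR before the rest of the release
    "helm.sh/hook-weight": "-5"
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      logging:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [logging]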

jtama avatar Jul 29 '22 11:07 jtama

As @jtama confirmed, we can close this issue.

yuriolisa avatar Dec 06 '22 11:12 yuriolisa