opentelemetry-helm-charts

CrashLoopBackoff error when starting pod

Open Sandeepb-nextcar opened this issue 2 years ago • 11 comments

Error: failed to start container "opentelemetry-collector": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/otelcol-contrib": stat /otelcol-contrib: no such file or directory: unknown

Sandeepb-nextcar avatar Apr 15 '22 14:04 Sandeepb-nextcar

Hello @Sandeepb-nextcar, can you provide more details on the issue you are having? What version of the charts are you using, what does your values.yaml look like?

TylerHelmuth avatar Apr 15 '22 16:04 TylerHelmuth

I deployed the otel agent as a DaemonSet, and while starting up I'm seeing a CrashLoopBackOff error. I'm using the latest version.

Values.yaml:

config:
  processors:
    resourcedetection:
      detectors: [env]
      timeout: 5s
      override: false
    k8sattributes:
      passthrough: true
    batch:
      timeout: 10s
  service:
    pipelines:
      traces:
        processors: [batch, resourcedetection, k8sattributes]

image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: latest

extraEnvs:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.pod.ip=$(POD_IP)"

agentCollector:
  configOverride:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 10s
      resourcedetection:
        detectors: [ec2, system]
    exporters:
      datadog/api:
        env: qa
        tags:
          - service:test
        api:
          key: ""

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, resourcedetection, k8sattributes]
          exporters: [datadog/api]

Sandeepb-nextcar avatar Apr 15 '22 17:04 Sandeepb-nextcar

What happens if you set image.tag to 0.48.0 instead of latest?

TylerHelmuth avatar Apr 15 '22 17:04 TylerHelmuth
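
For anyone following along, a minimal sketch of what pinning the tag can look like, either in values.yaml or at install time. The release name otel-agent and the chart reference below are placeholders, not from this thread:

# values.yaml: pin to a known collector release instead of "latest"
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: 0.48.0

# Or override the tag on the command line (hypothetical release name and chart repo alias):
#   helm upgrade --install otel-agent open-telemetry/opentelemetry-collector -f values.yaml --set image.tag=0.48.0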

Are you running this in k8s or a kind cluster?

TylerHelmuth avatar Apr 15 '22 17:04 TylerHelmuth

We are running on EKS. When I ran this yesterday it did pick 0.48.0; however, I applied again today, which picked latest, and I still see the same issue.

Sandeepb-nextcar avatar Apr 15 '22 18:04 Sandeepb-nextcar

Did this break with helm chart 0.14.1 or is this your first time installing?

TylerHelmuth avatar Apr 15 '22 18:04 TylerHelmuth

Nope, we had another environment running fine on version 0.8.1.

Sandeepb-nextcar avatar Apr 15 '22 19:04 Sandeepb-nextcar

And yes, using the recent chart version it failed with the error.

Sandeepb-nextcar avatar Apr 15 '22 19:04 Sandeepb-nextcar

If you go back to the previous version that was working does it still work?

TylerHelmuth avatar Apr 15 '22 19:04 TylerHelmuth

It did work.

Sandeepb-nextcar avatar Apr 15 '22 20:04 Sandeepb-nextcar

I was only able to reproduce this issue on chart version 0.14.1 when tag was set to latest. With tag set to 0.48.0 I could install your values.yaml as expected.

Here is the YAML I was able to install successfully:

config:
  processors:
    resourcedetection:
      detectors: [env]
      timeout: 5s
      override: false
    k8sattributes:
      passthrough: true
    batch:
      timeout: 10s
  service:
    pipelines:
      traces:
        processors: [batch, resourcedetection, k8sattributes]

image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: 0.48.0

extraEnvs:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.pod.ip=$(POD_IP)"

agentCollector:
  configOverride:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 10s
      resourcedetection:
        detectors: [ec2, system]
    exporters:
      datadog/api:
        env: qa
        tags:
          - service:test
        api:
          key: "asdafsdf"

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, resourcedetection, k8sattributes]
          exporters: [datadog/api]

TylerHelmuth avatar Apr 15 '22 21:04 TylerHelmuth

I set the command to otelcol and that fixed the issue.

I got that from the default image entrypoint: https://hub.docker.com/layers/otel/opentelemetry-collector/0.85.0/images/sha256-7fcfeb3982b8ddd85f5ccda88a46f4b43738a3faef43652c5ec5f02fee6c6c93?context=explore

alecrajeev avatar Sep 26 '23 16:09 alecrajeev
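
A minimal sketch of that kind of override, assuming a chart version that exposes command.name in its values.yaml (check your chart's default values before relying on it):

# values.yaml snippet: point the chart at the core distribution's binary
command:
  name: otelcol   # the core image ships /otelcol; the contrib image ships /otelcol-contrib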

OK, I think I understand this issue better.

There are multiple distributions of the opentelemetry-collector: core, contrib, aws, etc.

This Docker Hub image is the core distribution: https://hub.docker.com/r/otel/opentelemetry-collector/tags

There is a second Docker Hub image for contrib: https://hub.docker.com/r/otel/opentelemetry-collector-contrib/tags

The Helm chart assumes the contrib image is being used (its binary is /otelcol-contrib). I was originally using the core image, whose binary is /otelcol.

So if you are running into this issue, I would double-check which Docker image is being used.

alecrajeev avatar Sep 26 '23 18:09 alecrajeev
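
In other words, the image repository and the binary the chart invokes have to match. A rough sketch of the two consistent combinations; the tag values are illustrative, not from this thread:

# Option A: use the contrib image, which ships the /otelcol-contrib binary the chart expects
image:
  repository: otel/opentelemetry-collector-contrib
  tag: 0.85.0   # example tag; pin to the release you actually run

# Option B: keep the core image and point the chart at its /otelcol binary,
# assuming your chart version exposes command.name
# image:
#   repository: otel/opentelemetry-collector
#   tag: 0.85.0   # example tag
# command:
#   name: otelcol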