opentelemetry-helm-charts
CrashLoopBackoff error when starting pod
Error: failed to start container "opentelemetry-collector": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "/otelcol-contrib": stat /otelcol-contrib: no such file or directory: unknown
Hello @Sandeepb-nextcar, can you provide more details on the issue you are having? What version of the chart are you using, and what does your values.yaml look like?
I deployed the otel agent as a daemonset, and while starting up I'm seeing a CrashLoopBackOff error. I'm using the latest version.
Values.yaml:
config:
  processors:
    resourcedetection:
      detectors: [env]
      timeout: 5s
      override: false
    k8sattributes:
      passthrough: true
    batch:
      timeout: 10s
  service:
    pipelines:
      traces:
        processors: [batch, resourcedetection, k8sattributes]
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: latest
extraEnvs:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.pod.ip=$(POD_IP)"
agentCollector:
  configOverride:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 10s
      resourcedetection:
        detectors: [ec2, system]
    exporters:
      datadog/api:
        env: qa
        tags:
          - service:test
        api:
          key: ""
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, resourcedetection, k8sattributes]
          exporters: [datadog/api]
What happens if you set image.tag to 0.48.0 instead of latest?
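For example, pinning the tag on the same contrib image you already reference (a sketch of just the relevant values.yaml section):

image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: 0.48.0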
Are you running this in k8s or a kind cluster?
We are running on EKS. When I ran this yesterday it did pick 0.48.0; however, I applied it again today and it picked latest, but I'm still seeing the same issue.
Did this break with helm chart 0.14.1 or is this your first time installing?
No, we had another environment running fine on version 0.8.1, and yes, with the recent chart version it failed with this error.
If you go back to the previous version that was working, does it still work?
It did work.
I was only able to reproduce this issue on chart version 0.14.1 if tag is set to latest. If tag is 0.48.0 then I could install your values.yaml as expected.
Here is the yaml I was able to install successfully
config:
  processors:
    resourcedetection:
      detectors: [env]
      timeout: 5s
      override: false
    k8sattributes:
      passthrough: true
    batch:
      timeout: 10s
  service:
    pipelines:
      traces:
        processors: [batch, resourcedetection, k8sattributes]
image:
  repository: otel/opentelemetry-collector-contrib
  pullPolicy: IfNotPresent
  tag: 0.48.0
extraEnvs:
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "k8s.pod.ip=$(POD_IP)"
agentCollector:
  configOverride:
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:
      batch:
        timeout: 10s
      resourcedetection:
        detectors: [ec2, system]
    exporters:
      datadog/api:
        env: qa
        tags:
          - service:test
        api:
          key: "asdafsdf"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch, resourcedetection, k8sattributes]
          exporters: [datadog/api]
I set the command to otelcol and that fixed the issue.
I got that from the default image entrypoint: https://hub.docker.com/layers/otel/opentelemetry-collector/0.85.0/images/sha256-7fcfeb3982b8ddd85f5ccda88a46f4b43738a3faef43652c5ec5f02fee6c6c93?context=explore
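If you want to keep the core image and apply the same workaround, here is a minimal values.yaml sketch, assuming your chart version exposes command.name (older chart versions may need a different override):

image:
  repository: otel/opentelemetry-collector
  tag: 0.85.0
# command.name is assumed to be available in your chart version
command:
  name: otelcol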
OK, I think I understand this issue better.
There are multiple distributions of the opentelemetry-collector: core, contrib, aws, etc.
This dockerhub image uses core: https://hub.docker.com/r/otel/opentelemetry-collector/tags
There is a second dockerhub image for contrib: https://hub.docker.com/r/otel/opentelemetry-collector-contrib/tags
The helm chart assumes that the contrib one is being used (otelcol-contrib). I was originally using the core one with the /otelcol binary.
So if you are running into this issue, I would double check which docker image is being used.
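To summarize, the image and the command need to point at the same distribution. A sketch of the two consistent values.yaml combinations (command.name is assumed here; check your chart version's values for the exact knob):

# contrib distribution: what the chart expects by default
image:
  repository: otel/opentelemetry-collector-contrib
command:
  name: otelcol-contrib

# core distribution: requires overriding the command
image:
  repository: otel/opentelemetry-collector
command:
  name: otelcol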