opentelemetry-operator
Node.js pod CrashLoopBackOff after auto-instrumenting
Component(s)
No response
What happened?
Description
Node.js application enters CrashLoopBackOff when auto-instrumentation is enabled:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    instrumentation.opentelemetry.io/container-names: unleash
    instrumentation.opentelemetry.io/inject-nodejs: my-system/management-features
  creationTimestamp: "2024-02-21T19:48:21Z"
  generateName: my-demo-b64c6d87b-
  labels:
    app.kubernetes.io/created-by: controller-manager
    app.kubernetes.io/instance: my-demo
    app.kubernetes.io/name: Unleash
    app.kubernetes.io/part-of: unleasherator
    pod-template-hash: b64c6d87b
  name: my-demo-b64c6d87b-w72hh
  namespace: bifrost-unleash
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: my-demo-b64c6d87b
    uid: bd441ad4-6ada-4e92-b2fd-ec17a9ae9a44
  resourceVersion: "520000502"
  uid: 4a93c92e-3df2-4fc3-a611-1bdd4abe1ee6
spec:
  containers:
  - env:
    - name: INIT_ADMIN_API_TOKENS
      valueFrom:
        secretKeyRef:
          key: token
          name: unleasherator-my-demo-admin-key
    - name: DATABASE_PASS
      valueFrom:
        secretKeyRef:
          key: POSTGRES_PASSWORD
          name: my-demo
    - name: DATABASE_USER
      valueFrom:
        secretKeyRef:
          key: POSTGRES_USER
          name: my-demo
    - name: DATABASE_NAME
      valueFrom:
        secretKeyRef:
          key: POSTGRES_DB
          name: my-demo
    - name: DATABASE_HOST
      value: localhost
    - name: DATABASE_PORT
      value: "5432"
    - name: DATABASE_SSL
      value: "false"
    - name: DATABASE_URL
      value: postgres://$(DATABASE_USER):$(DATABASE_PASS)@$(DATABASE_HOST):$(DATABASE_PORT)/$(DATABASE_NAME)
    - name: GOOGLE_IAP_AUDIENCE
      value: /projects/898056957967/global/backendServices/6771496285844745965
    - name: TEAMS_API_URL
      value: http://teams-backend.my-system.svc/query
    - name: TEAMS_API_TOKEN
      valueFrom:
        secretKeyRef:
          key: token
          name: teams-api-token
    - name: TEAMS_ALLOWED_TEAMS
      value: aura,frontendplattform
    - name: LOG_LEVEL
      value: warn
    - name: DATABASE_POOL_MAX
      value: "3"
    - name: DATABASE_POOL_IDLE_TIMEOUT_MS
      value: "1000"
    - name: NODE_OPTIONS
      value: ' --require /otel-auto-instrumentation-nodejs/autoinstrumentation.js'
    - name: OTEL_SERVICE_NAME
      value: my-demo
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: http://opentelemetry-management-collector.my-system:4317
    - name: OTEL_RESOURCE_ATTRIBUTES_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: OTEL_RESOURCE_ATTRIBUTES_NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    - name: OTEL_PROPAGATORS
      value: tracecontext,baggage,b3
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: k8s.container.name=unleash,k8s.deployment.name=my-demo,k8s.namespace.name=bifrost-unleash,k8s.node.name=$(OTEL_RESOURCE_ATTRIBUTES_NODE_NAME),k8s.pod.name=$(OTEL_RESOURCE_ATTRIBUTES_POD_NAME),k8s.replicaset.name=my-demo-b64c6d87b,service.version=v5.8.2-20240130-115753-fd5cd41
    image: europe-north1-docker.pkg.dev/my-io/my/images/unleash-v4:v5.8.2-20240130-115753-fd5cd41
    imagePullPolicy: Always
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /health
        port: 4242
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    name: unleash
    ports:
    - containerPort: 4242
      name: http
      protocol: TCP
    resources:
      limits:
        memory: 256Mi
      requests:
        cpu: 100m
        memory: 128Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1001
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xfgkk
      readOnly: true
    - mountPath: /otel-auto-instrumentation-nodejs
      name: opentelemetry-auto-instrumentation-nodejs
  - args:
    - --structured-logs
    - --port=5432
    - my-management-233d:europe-north1:bifrost-3de70742
    image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.1.0
    imagePullPolicy: IfNotPresent
    name: sql-proxy
    resources:
      limits:
        memory: 100Mi
      requests:
        cpu: 10m
        memory: 100Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      runAsNonRoot: true
      runAsUser: 65532
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xfgkk
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - cp
    - -a
    - /autoinstrumentation/.
    - /otel-auto-instrumentation-nodejs
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:0.46.0
    imagePullPolicy: IfNotPresent
    name: opentelemetry-auto-instrumentation-nodejs
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 50m
        memory: 128Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1001
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /otel-auto-instrumentation-nodejs
      name: opentelemetry-auto-instrumentation-nodejs
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-xfgkk
      readOnly: true
  nodeName: gke-my-management--nap-e2-standard--7ff15a1a-9bqj
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: gke.io/optimize-utilization-scheduler
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: bifrost-unleash-sql-user
  serviceAccountName: bifrost-unleash-sql-user
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-xfgkk
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
  - emptyDir:
      sizeLimit: 200Mi
    name: opentelemetry-auto-instrumentation-nodejs
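For context on the two injected pieces in the spec above: the init container copies the agent into the shared emptyDir, and the `NODE_OPTIONS` env var makes Node.js preload it via `--require` before the application's entry point runs. The same preload mechanism can be sketched locally (paths and file contents below are illustrative, not from this pod):

```shell
# Illustrative stand-ins for the injected agent and the app itself.
echo 'console.log("instrumentation loaded")' > /tmp/autoinstrumentation.js
echo 'console.log("app main")' > /tmp/app.js

# --require preloads the module before the entry point, as the operator
# arranges via NODE_OPTIONS in the injected pod spec.
NODE_OPTIONS='--require /tmp/autoinstrumentation.js' node /tmp/app.js
```

This is why a failed init container is fatal here: if the copy never completes, the `--require` path does not exist and the Node.js process cannot start.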
Steps to Reproduce
Enable auto-instrumentation of a Node.js application like the one above.
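For reference, injection is triggered by the `instrumentation.opentelemetry.io/inject-nodejs` annotation pointing at an Instrumentation resource (`my-system/management-features` above). A plausible minimal sketch of such a resource follows; only the exporter endpoint and propagators are taken from the injected environment variables above, the rest is assumed:

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: management-features
  namespace: my-system
spec:
  exporter:
    endpoint: http://opentelemetry-management-collector.my-system:4317
  propagators:
    - tracecontext
    - baggage
    - b3
```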
Expected Result
It should not crash.
Actual Result
It fails to start with the following error:
cp: can't preserve ownership of '...': Operation not permitted
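The failing command is the init container's `cp -a`, which is archive mode and tries to preserve ownership among other attributes. The init container runs as non-root (`runAsUser: 1001`), so it cannot chown the copied files to their original owner, and the copy reports "Operation not permitted". A minimal sketch outside Kubernetes (paths are illustrative; exact behavior depends on whether `cp` is GNU or busybox):

```shell
# Stand-in for the agent files baked into the autoinstrumentation image.
mkdir -p /tmp/otel-src /tmp/otel-dst
echo 'console.log("agent")' > /tmp/otel-src/autoinstrumentation.js

# cp -a implies -p (preserve mode, ownership, timestamps); as a non-root
# user copying files owned by another uid, the chown step would fail.
# A plain recursive copy does not attempt the chown, so it succeeds:
cp -r /tmp/otel-src/. /tmp/otel-dst/
ls /tmp/otel-dst/autoinstrumentation.js
```

Dropping the ownership-preservation behavior (or running the init container with a uid/fsGroup that owns the files) would avoid the error; which of these the operator should do is the question this issue raises.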
Kubernetes Version
v1.28.3
Operator version
0.93.0
Collector version
latest
Environment information
Environment
Cloud: GKE
Log output
cp: can't preserve ownership of '/otel-auto-instrumentation-nodejs/./autoinstrumentation.js': Operation not permitted
cp: can't preserve ownership of '/otel-auto-instrumentation-nodejs/./autoinstrumentation.d.ts.map': Operation not permitted
...
Additional context
No response