opentelemetry-operator
Support resources (Requests and Limits) for auto-instrumentation initContainers
An auto-instrumented Pod may fail to schedule or force a cluster scale-up because no resource requests or limits are set on the injected initContainer, so the namespace defaults are applied. Those defaults are generally grossly excessive for a trivial initContainer and force the Pod to reserve more than it needs.
This affects all initContainers because of Kubernetes's resource rules for init containers: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources . And critically:
Scheduling is done based on effective requests/limits, which means init containers can reserve resources for initialization that are not used during the life of the Pod.
So the OTel auto-instrumentation initContainer is likely reserving node resources that can never be used. For example, if the app containers request 500m CPU in total but the injected initContainer inherits a 2-CPU default request, the Pod's effective CPU request becomes 2 CPU.
Additionally, not specifying limits can prevent the Pod from being admitted at all if a LimitRange defines a cap on container requests and limits.
The initContainer injected by the Instrumentation resource, at the request of the instrumentation.opentelemetry.io/inject-nodejs: "true" pod annotation, should default to some very low CPU and memory requests and limits. The Instrumentation CRD probably needs to make them configurable, since no doubt someone will find a problem with any sensible low default, somehow, somewhere.
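As a sketch of what that configurability could look like, a hypothetical resources field on the Instrumentation CRD (this field is a proposal, not an existing part of the CRD, and the image name is illustrative):

```yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  nodejs:
    image: myregistry/autoinstrumentation-nodejs:latest
    # Hypothetical field: resources applied to the injected initContainer.
    # A file copy needs almost nothing, so very low values should suffice.
    resources:
      requests:
        cpu: 10m
        memory: 32Mi
      limits:
        cpu: 50m
        memory: 64Mi
```

The operator would then copy these values onto the opentelemetry-auto-instrumentation initContainer it injects, keeping the Pod's effective request close to what the app containers actually need.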
If you see scheduling failures like
Warning FailedScheduling 38s default-scheduler 0/25 nodes are available: 22 Insufficient cpu, 3 node(s) had taint {CriticalAddonsOnly: true}, that the pod didn't tolerate.
or related errors, this is probably why.
Sample initContainer from kubectl describe pod:
Init Containers:
  ...
  opentelemetry-auto-instrumentation:
    Image:      myregistry/autoinstrumentation-nodejs@sha256:2679c21711ba8f829a48144ea458c537dd1cdd54f568ff33d609596321b436f7
    Port:       <none>
    Host Port:  <none>
    Command:
      cp
      -a
      /autoinstrumentation/.
      /otel-auto-instrumentation/
    Environment:  <none>
    Mounts:
      /otel-auto-instrumentation from opentelemetry-auto-instrumentation (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8ggbs (ro)
Note the lack of any Limits or Requests section. For comparison, here's an Istio proxyv2 initContainer's settings:
Limits:
  cpu:     2
  memory:  1Gi
Requests:
  cpu:     100m
  memory:  128Mi
I don't see a practical way to patch this in my deployments because the container spec is generated by the operator on the fly.
This can be worked around by setting very low default resource limits and requests for containers in the target namespace(s), then ensuring all other containers have explicit requests and limits. But that's very cumbersome and intrusive for something meant to be transparent.
I don't know how to go about patching the CRD to add this, but I can have a go at a patch adding some "sensible defaults" to the operator. Maybe env-var-based configuration at the operator level would be sufficient control anyway?
For example, the workaround LimitRange:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-namespace-here
spec:
  limits:
    - default:
        cpu: 500m
        memory: 250Mi
      defaultRequest:
        cpu: 100m
        memory: 100Mi
      type: Container
There's another use case where setting resources on the initContainer is necessary: if ResourceQuotas are set in the cluster, auto-instrumented Pods are not created at all (blocked at the ReplicaSet level) because the injected initContainer is missing resource requests/limits.
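To illustrate (the namespace and quota names are placeholders): once a ResourceQuota tracks compute resources like the one below, Kubernetes rejects any Pod in the namespace whose containers, including injected initContainers, omit the corresponding requests and limits, which is exactly what happens to auto-instrumented Pods today.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-namespace-here
spec:
  hard:
    # With these set, every container in the namespace must declare
    # matching requests/limits or Pod creation is denied at admission.
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```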