GrafanaAgent API does not allow setting resource limits for the config-reloader container
What's wrong?
When the Grafana Agent Operator is deployed by a Helm chart into a namespace with a ResourceQuota, the pods created by the operator are rejected by the quota because no resources are set on the config-reloader container spec.
The API should either:
- provide an option to specify resources for the config-reloader container (a hypothetical sketch is shown below), or
- apply resources consistently to all containers of the pods created by the agent
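Purely as an illustration of the first option, a hypothetical configReloader field on the GrafanaAgent spec could look like this (this field does not exist in the current CRD; only spec.logs.resources does):

apiVersion: monitoring.grafana.com/v1alpha1
kind: GrafanaAgent
metadata:
  name: loki
  namespace: monitoring
spec:
  logs:
    # existing field: applied only to the grafana-agent container
    resources:
      requests:
        cpu: 10m
        memory: 32M
      limits:
        cpu: 50m
        memory: 64M
  # hypothetical field: resources applied to the config-reloader container
  # of every pod the operator creates
  configReloader:
    resources:
      requests:
        cpu: 10m
        memory: 32M
      limits:
        cpu: 50m
        memory: 64M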
Steps to reproduce
The error can be reproduced with the Loki Helm chart, which creates a GrafanaAgent instance.
- Deploy the grafana-agent-operator Helm chart into a namespace with a ResourceQuota (an illustrative quota is shown after this list)
- Deploy Loki into the same namespace
- The DaemonSet loki-logs created by the Grafana Agent Operator fails the quota
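For reference, a ResourceQuota along these lines is enough to trigger the failure; the quota name matches the one in the events below, but the values are illustrative:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: monitoring.quota
  namespace: monitoring
spec:
  hard:
    # Once these keys are set, every container in the namespace must declare
    # its own requests and limits, otherwise pod creation is rejected.
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi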
System information
OVH Managed Kubernetes Service, kubernetes version 1.25.12-3
Helm charts:
- grafana/grafana-agent-operator, version: ^0.3.11
- grafana/loki, version: ^5.39.0
Helm version v3.13.2
Agent operator: docker.io/grafana/agent-operator:v0.37.4
Logs
$ kubectl describe daemonset loki-logs -n monitoring
Name: loki-logs
Selector: app.kubernetes.io/instance=loki,app.kubernetes.io/managed-by=grafana-agent-operator,app.kubernetes.io/name=grafana-agent,grafana-agent=loki,operator.agent.grafana.com/name=loki,operator.agent.grafana.com/type=logs
Node-Selector: <none>
Labels: app.kubernetes.io/instance=loki
app.kubernetes.io/managed-by=grafana-agent-operator
app.kubernetes.io/name=grafana-agent
grafana-agent=loki
operator.agent.grafana.com/name=loki
operator.agent.grafana.com/type=logs
Annotations: deprecated.daemonset.template.generation: 1
meta.helm.sh/release-name: loki
meta.helm.sh/release-namespace: monitoring
Desired Number of Nodes Scheduled: 4
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app.kubernetes.io/instance=loki
app.kubernetes.io/managed-by=grafana-agent-operator
app.kubernetes.io/name=grafana-agent
app.kubernetes.io/version=v0-37-4
grafana-agent=loki
operator.agent.grafana.com/name=loki
operator.agent.grafana.com/type=logs
Annotations: kubectl.kubernetes.io/default-container: grafana-agent
Service Account: loki-grafana-agent
Containers:
config-reloader:
Image: quay.io/prometheus-operator/prometheus-config-reloader:v0.67.1
Port: <none>
Host Port: <none>
Args:
--config-file=/var/lib/grafana-agent/config-in/agent.yml
--config-envsubst-file=/var/lib/grafana-agent/config/agent.yml
--watch-interval=1m
--statefulset-ordinal-from-envvar=POD_NAME
--reload-url=http://127.0.0.1:8080/-/reload
Environment:
POD_NAME: (v1:metadata.name)
HOSTNAME: (v1:spec.nodeName)
SHARD: 0
Mounts:
/var/lib/docker/containers from dockerlogs (ro)
/var/lib/grafana-agent/config from config-out (rw)
/var/lib/grafana-agent/config-in from config (ro)
/var/lib/grafana-agent/data from data (rw)
/var/lib/grafana-agent/secrets from secrets (ro)
/var/log from varlog (ro)
grafana-agent:
Image: grafana/agent:v0.37.4
Port: 8080/TCP
Host Port: 0/TCP
Args:
-config.file=/var/lib/grafana-agent/config/agent.yml
-config.expand-env=true
-server.http.address=0.0.0.0:8080
-enable-features=integrations-next
Limits:
cpu: 50m
memory: 64M
Requests:
cpu: 10m
memory: 32M
Readiness: http-get http://:http-metrics/-/ready delay=0s timeout=3s period=5s #success=1 #failure=120
Environment:
POD_NAME: (v1:metadata.name)
HOSTNAME: (v1:spec.nodeName)
SHARD: 0
Mounts:
/var/lib/docker/containers from dockerlogs (ro)
/var/lib/grafana-agent/config from config-out (rw)
/var/lib/grafana-agent/config-in from config (ro)
/var/lib/grafana-agent/data from data (rw)
/var/lib/grafana-agent/secrets from secrets (ro)
/var/log from varlog (ro)
Volumes:
config:
Type: Secret (a volume populated by a Secret)
SecretName: loki-logs-config
Optional: false
config-out:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
secrets:
Type: Secret (a volume populated by a Secret)
SecretName: loki-secrets
Optional: false
varlog:
Type: HostPath (bare host directory volume)
Path: /var/log
HostPathType:
dockerlogs:
Type: HostPath (bare host directory volume)
Path: /var/lib/docker/containers
HostPathType:
data:
Type: HostPath (bare host directory volume)
Path: /var/lib/grafana-agent/data
HostPathType:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 28m daemonset-controller Error creating: pods "loki-logs-xhtj9" is forbidden: failed quota: monitoring.quota: must specify limits.cpu for: config-reloader; limits.memory for: config-reloader; requests.cpu for: config-reloader; requests.memory for: config-reloader
Adding output from $ kubectl -n monitoring describe grafanaagent loki:
Name: loki
Namespace: monitoring
Labels: app.kubernetes.io/instance=loki
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=loki
app.kubernetes.io/version=2.9.2
helm.sh/chart=loki-5.39.0
Annotations: meta.helm.sh/release-name: loki
meta.helm.sh/release-namespace: monitoring
API Version: monitoring.grafana.com/v1alpha1
Kind: GrafanaAgent
Metadata:
Creation Timestamp: 2023-11-30T20:44:42Z
Generation: 2
Resource Version: 3040719368
UID: 5bedee00-a8c7-4aba-a963-41be76adaebb
Spec:
Disable Reporting: false
Disable Support Bundle: false
Enable Config Read API: false
Logs:
Instance Selector:
Match Labels:
app.kubernetes.io/instance: loki
app.kubernetes.io/name: loki
Resources:
Limits:
Cpu: 50m
Memory: 64M
Requests:
Cpu: 10m
Memory: 32M
Service Account Name: loki-grafana-agent
Events: <none>
Just to clarify, this only applies to the Operator in static mode; the Agent's own Helm chart allows defining resources for the config-reloader container normally (see the sketch below).
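For comparison, the grafana/grafana-agent chart exposes something along these lines in its values.yaml (treat this as a sketch; key names may differ between chart versions):

# values.yaml for the grafana/grafana-agent Helm chart (static mode, no operator)
configReloader:
  enabled: true
  resources:
    requests:
      cpu: 10m
      memory: 32Mi
    limits:
      cpu: 50m
      memory: 64Mi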
This bothers me too; it would be great to be able to define resources for config-reloader as well.
As a workaround, one can create a LimitRange with default and defaultRequest values for the namespace, but that requires additional permissions.
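A minimal sketch of that workaround, assuming the monitoring namespace and illustrative values:

apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # illustrative name
  namespace: monitoring
spec:
  limits:
    - type: Container
      # applied as limits to any container that does not set its own,
      # e.g. the config-reloader container created by the operator
      default:
        cpu: 50m
        memory: 64Mi
      # applied as requests to any container that does not set its own
      defaultRequest:
        cpu: 10m
        memory: 32Mi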