[Helm] Helm test requires self monitoring to be enabled
Describe the bug
I cannot create a Helm template for Loki with chart version > 3.2.2. As this is the way ArgoCD deploys applications, I cannot deploy Loki with chart version > 3.2.2 using Helm.
E.g.:
Chart.yaml
apiVersion: v2
name: loki
version: 3.3.2
dependencies:
  - name: loki
    version: 3.3.2
    repository: https://grafana.github.io/helm-charts
values.yaml
loki:
  loki:
    auth_enabled: false
    schemaConfig:
      configs:
        - from: 2020-10-24
          store: boltdb-shipper
          object_store: gcs
          schema: v12
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /var/loki/index
        cache_location: /var/loki/boltdb-cache
        cache_ttl: 24h # Can be increased for faster performance over longer query periods, uses more disk space
        shared_store: gcs
      gcs:
        bucket_name: loki
    storage:
      bucketNames:
        chunks: loki_chunks
        ruler: loki_ruler
        admin: loki_admin
      type: gcs
    memcached:
      chunk_cache:
        enabled: true
        host: "memcached-loki.loki"
        service: memcache
        batch_size: 1024
        parallelism: 100
      results_cache:
        enabled: true
        host: "memcached-loki.loki"
        service: memcache
        timeout: "500ms"
        default_validity: "12h"
    rulerConfig:
      storage:
        type: local
        local:
          directory: "/tmp/rules"
      rule_path: /tmp/scratch
      alertmanager_url: http://prometheus-infra-alertmanager.prometheus:80
      ring:
        kvstore:
          store: inmemory
      enable_api: true
      enable_alertmanager_v2: true
    # ---------------------
    # This section below is added because loki sometimes throws an error "too many outstanding requests", see https://github.com/grafana/loki/issues/4613
    # This should solve that
    query_scheduler:
      max_outstanding_requests_per_tenant: 2048
    limits_config:
      max_query_series: 5000
  rules:
    additionalGroups:
      - name: additional-loki-rules
        rules:
          - record: job:loki_request_duration_seconds_bucket:sum_rate
            expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job)
          - record: job_route:loki_request_duration_seconds_bucket:sum_rate
            expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, job, route)
          - record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
            expr: sum(rate(container_cpu_usage_seconds_total[1m])) by (node, namespace, pod, container)
  selfMonitoring:
    enabled: false
  ingress:
    # We use the Gateway
    enabled: false
  read:
    autoscaling:
      enabled: true
      minReplicas: 2
      maxReplicas: 5
    persistence:
      storageClass: premium-rwo
  write:
    nodeSelector:
      iam.gke.io/gke-metadata-server-enabled: "true"
    persistence:
      storageClass: premium-rwo
  monitoring:
    selfMonitoring:
      enabled: false
  gateway:
    enabled: true
    autoscaling:
      enabled: true
      maxReplicas: 5
    ingress:
      enabled: true
      hosts:
        - host: "loki.xxx.com"
          paths:
            - path: /
              pathType: ImplementationSpecific
      tls:
        - hosts:
            - loki.xxx.com
          secretName: tls-loki
      ingressClassName: nginx
To Reproduce
Steps to reproduce the behavior:
- Place the Chart.yaml and values.yaml in a folder.
- Run helm dependency build && helm template --debug . -f values.yaml > all.yaml && rm -rf Chart.lock charts
- If the version in Chart.yaml is > 3.2.2, it will fail with:
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Downloading loki from repo https://grafana.github.io/helm-charts
Deleting outdated charts
install.go:173: [debug] Original chart version: ""
install.go:190: [debug] CHART PATH: /home/xxx//loki
Error: template: loki/charts/loki/templates/validate.yaml:12:4: executing "loki/charts/loki/templates/validate.yaml" at <fail "Helm test requires self monitoring to be enabled">: error calling fail: Helm test requires self monitoring to be enabled
helm.go:81: [debug] template: loki/charts/loki/templates/validate.yaml:12:4: executing "loki/charts/loki/templates/validate.yaml" at <fail "Helm test requires self monitoring to be enabled">: error calling fail: Helm test requires self monitoring to be enabled
Expected behavior
When the version is 3.2.2 or below, it creates a file called all.yaml with the whole manifest of Loki. This can be deployed using kubectl apply -f all.yaml
Environment:
- Infrastructure: kubernetes
- Deployment tool: helm
I'm not sure what helm test does (or where to read about it), but if you are disabling selfMonitoring, maybe you should also disable tests?
test:
  enabled: false
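For the umbrella-chart layout in the original report, where Loki is pulled in as a dependency named loki, this setting would presumably have to sit under the loki key to reach the subchart. A minimal sketch, assuming the chart reads test.enabled and monitoring.selfMonitoring.enabled at its top level, as the failing check in validate.yaml suggests:

```yaml
# values.yaml of the wrapper chart (sketch, not a verified configuration)
loki:                      # name of the dependency in Chart.yaml
  monitoring:
    selfMonitoring:
      enabled: false       # self monitoring stays off
  test:
    enabled: false         # skip the Helm test so the validate.yaml check is not triggered
```

When the chart is installed directly rather than as a dependency, the same keys would go at the top level, without the outer loki key.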
We're hitting this as well - the 'solution' is to disable 'test' as @AurimasNav says, but it feels a bit wrong.
If the test relies on:
selfMonitoring:
  enabled: true
Then shouldn't setting that value to false also disable that specific test?
I'm not sure what helm test does (or where to read about it), but if you are disabling selfMonitoring, maybe you should also disable tests?
test:
  enabled: false
Disabling validation checks should not be the solution there. The Loki chart providers would need to make the self monitoring more configurable...
I mean why is the chart delivering Prometheus CRDs... srsly
Any update here?
I ran into this same issue with Loki Helm chart 5.5.2 (Loki version 2.8.2).
The CRDs from the Loki Helm chart conflict with the CRDs installed by kube-prometheus-stack, causing a race condition if they're both applied at the same time.
I've disabled the CRDs from Loki by setting monitoring.selfMonitoring.grafanaAgent.installOperator: false, but with selfMonitoring.enabled: true (default) it fails to apply the chart because these CRDs are required:
monitoring.grafana.com/v1alpha1/PodLogs
monitoring.grafana.com/v1alpha1/GrafanaAgent
monitoring.grafana.com/v1alpha1/LogsInstance
Since Prometheus can monitor Loki, I figured it is safe to set selfMonitoring.enabled: false, but now I receive the error that others have mentioned (loki/templates/validate.yaml:6:4): Helm test requires self monitoring to be enabled. I get this error when using the most recent Loki chart version, 5.47.2.
Edit: It looks like the only helm test implemented is based on the Loki canary which is part of the self-monitoring: https://github.com/grafana/loki/blob/main/production/helm/loki/templates/tests/test-canary.yaml
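Putting the pieces from this comment together, the values for the kube-prometheus-stack scenario might look roughly like the following. This is only a sketch: it assumes the chart is installed directly (not as a subchart), and that test.enabled is the switch for the canary-based Helm test, as suggested earlier in the thread. It has not been verified against every chart version.

```yaml
# values.yaml for the Loki chart (sketch under the assumptions above)
monitoring:
  selfMonitoring:
    enabled: false           # Prometheus monitors Loki instead
    grafanaAgent:
      installOperator: false # do not install the Grafana Agent operator and its CRDs
test:
  enabled: false             # the only Helm test relies on the canary, so turn it off too
```

With both selfMonitoring and test disabled, the condition that triggers the "Helm test requires self monitoring to be enabled" failure should no longer be met.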