resultsCache is not used
Describe the bug: resultsCache is not used.
To Reproduce
Steps to reproduce the behavior:
- Loki Helm chart version 6.27.0
- Promtail Helm chart version 6.15.3
Expected behavior: resultsCache is used.
Environment:
- Infrastructure: Kubernetes
- Deployment tool: helm
Screenshots, Promtail config, or terminal output
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: grafana
  namespace: loki-core
spec:
  interval: 24h
  url: https://grafana.github.io/helm-charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: loki-core
spec:
  interval: 30m
  install:
    remediation:
      retries: 3
  chart:
    spec:
      chart: loki
      version: 6.27.0
      sourceRef:
        kind: HelmRepository
        name: grafana
  values:
    deploymentMode: Distributed
    global:
      image:
        registry: "harbor.corp/dockerhub"
    .globalExtra: &globalExtra
      extraArgs:
        - "-config.expand-env=true"
      extraEnv:
        - name: S3_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: loki-s3
              key: S3_ACCESS_KEY_ID
        - name: S3_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: loki-s3
              key: S3_SECRET_ACCESS_KEY
      tolerations:
        - key: "role"
          operator: "Equal"
          value: "loki"
          effect: "NoSchedule"
      nodeSelector:
        role: loki
    gateway:
      enabled: false
    ingress:
      enabled: true
      ingressClassName: nginx
      annotations:
        cert-manager.io/cluster-issuer: "cluster-issuer"
        nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
        nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
      hosts:
        - loki-core-v2.corp
    serviceMonitor:
      enabled: true
    compactor:
      enabled: true
      replicas: 1
      <<: *globalExtra
      persistence:
        enabled: true
        claims:
          - name: data
            size: 50Gi
            storageClass: yc-network-ssd
      resources:
        requests:
          memory: "2Gi"
          cpu: "800m"
        limits:
          memory: "2Gi"
    distributor:
      <<: *globalExtra
      replicas: 1
      autoscaling:
        enabled: true
        minReplicas: 3
        maxReplicas: 7
        targetCPUUtilizationPercentage: 70
        targetMemoryUtilizationPercentage: 70
      resources:
        requests:
          memory: "260Mi"
          cpu: "300m"
        limits:
          memory: "260Mi"
      maxUnavailable: 1
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: []
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 99
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: distributor
                topologyKey: kubernetes.io/hostname
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: distributor
                topologyKey: topology.kubernetes.io/zone
    ingester:
      <<: *globalExtra
      replicas: 8
      persistence:
        enabled: true
        claims:
          - name: data
            size: 50Gi
            storageClass: yc-network-ssd
      autoscaling:
        enabled: false
      resources:
        requests:
          memory: "5Gi"
          cpu: "600m"
        limits:
          memory: "10Gi"
      maxUnavailable: 1
      zoneAwareReplication:
        zoneA:
          nodeSelector:
            topology.kubernetes.io/zone: ru-central1-a
            role: loki
        zoneB:
          nodeSelector:
            topology.kubernetes.io/zone: ru-central1-b
            role: loki
        zoneC:
          nodeSelector:
            topology.kubernetes.io/zone: ru-central1-d
            role: loki
    indexGateway:
      <<: *globalExtra
      replicas: 3
      enabled: true
      persistence:
        enabled: true
        storageClass: yc-network-ssd
      resources:
        requests:
          memory: "2Gi"
          cpu: "500m"
        limits:
          memory: "4Gi"
      maxUnavailable: 1
    querier:
      <<: *globalExtra
      autoscaling:
        enabled: true
        minReplicas: 7
        maxReplicas: 15
        targetCPUUtilizationPercentage: 70
        targetMemoryUtilizationPercentage: 70
      resources:
        requests:
          memory: "3Gi"
          cpu: "2500m"
        limits:
          memory: "5Gi"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: []
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 99
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: querier
                topologyKey: kubernetes.io/hostname
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: querier
                topologyKey: topology.kubernetes.io/zone
      maxUnavailable: 1
    queryFrontend:
      <<: *globalExtra
      autoscaling:
        enabled: true
        minReplicas: 2
        maxReplicas: 4
        targetCPUUtilizationPercentage: 60
        targetMemoryUtilizationPercentage: 60
      resources:
        requests:
          memory: "1024Mi"
          cpu: "100m"
        limits:
          memory: "1024Mi"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution: []
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 99
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: query-frontend
                topologyKey: kubernetes.io/hostname
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/component: query-frontend
                topologyKey: topology.kubernetes.io/zone
      maxUnavailable: 1
    queryScheduler:
      <<: *globalExtra
      replicas: 1
    loki:
      auth_enabled: false
      storage:
        type: s3
        bucketNames:
          chunks: core-loki
          ruler: core-loki
          admin: core-loki
        s3:
          endpoint: https://storage.cloud.net:443/
          region: ru-central1
          bucketnames: core-loki
          secretAccessKey: "${S3_SECRET_ACCESS_KEY}"
          accessKeyId: "${S3_ACCESS_KEY_ID}"
      tsdb_shipper:
        shared_store: s3
        active_index_directory: /var/loki/tsdb-index
        cache_location: /var/loki/tsdb-cache
      schemaConfig:
        configs:
          - from: "2020-01-01"
            store: tsdb
            object_store: s3
            schema: v12
            index:
              prefix: tsdb_index_
              period: 24h
      distributor:
        ring:
          kvstore:
            store: memberlist
      ingester:
        lifecycler:
          ring:
            kvstore:
              store: memberlist
            replication_factor: 1
        autoforget_unhealthy: true
        chunk_idle_period: 1h
        chunk_target_size: 1572864
        max_chunk_age: 1h
        chunk_encoding: snappy
      server:
        grpc_server_max_recv_msg_size: 4194304
        grpc_server_max_send_msg_size: 4194304
      limits_config:
        allow_structured_metadata: false
        reject_old_samples: true
        reject_old_samples_max_age: 168h
        max_cache_freshness_per_query: 10m
        split_queries_by_interval: 15m
        retention_period: 30d
        max_global_streams_per_user: 0
        ingestion_rate_mb: 30
        query_timeout: 300s
        volume_enabled: true
      memcached:
        chunk_cache:
          enabled: true
        results_cache:
          enabled: true
      rulerConfig:
        remote_write:
          enabled: true
          clients:
            vmagent_local:
              url: http://vmagent-victoria-metrics-k8s-stack.victoria-metrics.svc:8429/api/v1/write?extra_label=cluster=infra
    ruler:
      enabled: true
      replicas: 1
      <<: *globalExtra
      extraArgs:
        - "-config.expand-env=true"
        # loki chart set {{- define "loki.rulerStorageConfig" -}} from global .Values.loki.storage.s3
        - "-ruler.storage.type=local"
        - "-ruler.storage.local.directory=/etc/loki/rules"
      storage:
        type: local
        local:
          directory: /etc/loki/rules
      persistence:
        enabled: true
        size: 10Gi
      ring:
        kvstore:
          store: memberlist
      directories:
        fake:
          rules.txt: |
            groups:
              - name: logs2metrics
                interval: 1m
                rules:
                  - record: panic_count
                    expr: |
                      sum by (app,cluster) (
                        count_over_time({app=~".+",level!~"INFO|info|debug"}|="panic"[5m])
                      )
                  - record: logs_size_by_app
                    expr: |
                      sum by (app,cluster)(
                        bytes_over_time({app=~".+"}[1m])
                      )
    resultsCache:
      allocatedMemory: 4096
    chunksCache:
      allocatedMemory: 12288
    bloomCompactor:
      replicas: 0
    bloomGateway:
      replicas: 0
    backend:
      replicas: 0
    read:
      replicas: 0
    write:
      replicas: 0
    singleBinary:
      replicas: 0
    test:
      enabled: false
    lokiCanary:
      enabled: false
    monitoring:
      serviceMonitor:
        enabled: true
      rules:
        enabled: true
In the rendered Loki config, the chunksCache section ends up as:

chunk_store_config:
  chunk_cache_config:
    background:
      writeback_buffer: 500000
      writeback_goroutines: 1
      writeback_size_limit: 500MB
    default_validity: 0s
    memcached:
      batch_size: 4
      parallelism: 5
    memcached_client:
      addresses: dnssrvnoa+_memcached-client._tcp.loki-chunks-cache.loki-core.svc
      consistent_hash: true
      max_idle_conns: 72
      timeout: 2000ms
and for resultsCache:

results_cache:
  cache:
    background:
      writeback_buffer: 500000
      writeback_goroutines: 1
      writeback_size_limit: 500MB
    default_validity: 12h
    memcached_client:
      addresses: dnssrvnoa+_memcached-client._tcp.loki-results-cache.loki-core.svc
      consistent_hash: true
      timeout: 500ms
      update_interval: 1m
Why does the chunksCache config get a memcached block, while the resultsCache config only gets memcached_client? What actually writes to the results cache? And how can I enable read/write logging in memcached to see whether it is being used?
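For context, both blocks are legal in Loki's cache_config for any cache: memcached_client carries the connection settings, while memcached carries the batch-fetch settings (batch_size, parallelism). Below is a minimal sketch of forcing a memcached block onto the results cache through the chart's loki.structuredConfig merge; the placement under query_range is my assumption based on the rendered config above, and this only tunes batching, it does not by itself make the cache get used:

loki:
  structuredConfig:
    # assumption: structuredConfig is deep-merged into the rendered Loki config
    query_range:
      results_cache:
        cache:
          memcached:
            # same batching knobs the chart sets for the chunks cache
            batch_size: 4
            parallelism: 5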
@patsevanton Having the same issue, thank you for opening this. Also, nice dashboard. Mind sharing where you got it from, or the JSON? Thanks.
@earimont-ib I think this dashboard is a simple memcached dashboard from the Internet.
@patsevanton Found it, thank you.
I'm observing the same. First noticed with chart version 6.24.1 (Loki 3.3.2), now updated to chart version 6.30.1 (Loki 3.5.0). In my case it's a SimpleScalable deployment.
Update: By digging around I found https://github.com/grafana/loki/blob/5af3150aace25384cb4e8fd2e9800900f9aaf4b3/pkg/querier/queryrange/log_result_cache.go#L53
which suggests that only negative results (log queries that return nothing) are stored in that log result cache. The associated metric should be loki_query_frontend_log_result_cache_hit_total, and according to it I do get cache hits, though at a very low rate. So maybe it's not a bug after all, but just a misunderstanding of what that specific cache can store.
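If it helps others checking the same thing, a rough Prometheus-side sanity check could be a recording rule like the sketch below. The hit counter is the one named above; the miss counter name (loki_query_frontend_log_result_cache_miss_total) is an assumption taken from the same source file, so verify both names against the query-frontend /metrics output first:

groups:
  - name: loki-log-result-cache-check
    rules:
      # rough hit ratio of the log result cache over the last 5 minutes
      - record: loki:log_result_cache:hit_ratio
        expr: |
          sum(rate(loki_query_frontend_log_result_cache_hit_total[5m]))
          /
          (
            sum(rate(loki_query_frontend_log_result_cache_hit_total[5m]))
            + sum(rate(loki_query_frontend_log_result_cache_miss_total[5m]))
          )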
Is there any update regarding this? Why is the results cache not being used?
Having the same problem here. Any update?