Loki 3.0.0 doesn't support external Memcached clusters

Open elburnetto-intapp opened this issue 1 year ago • 5 comments

Describe the bug When deploying Loki 3.0.0, the Helm chart lets you specify the details of an external Memcached cluster:

  memcached:
    chunk_cache:
      enabled: true
      host: memcached.monitoring.svc.cluster.local
      service: "memcache"
      batch_size: 256
      parallelism: 10
    results_cache:
      enabled: true
      host: memcached.monitoring.svc.cluster.local
      service: "memcache"
      timeout: "500ms"
      default_validity: "12h"

However, in the Loki config template, the only references to Memcached are for the Memcached instances that the chart itself deploys in distributed mode:

    query_range:
      align_queries_with_step: true
      {{- with .Values.loki.query_range }}
      {{- tpl (. | toYaml) $ | nindent 4 }}
      {{- end }}
      {{- if .Values.resultsCache.enabled }}
      {{- with .Values.resultsCache }}
      cache_results: true
      results_cache:
        cache:
          default_validity: {{ .defaultValidity }}
          background:
            writeback_goroutines: {{ .writebackParallelism }}
            writeback_buffer: {{ .writebackBuffer }}
            writeback_size_limit: {{ .writebackSizeLimit }}
          memcached_client:
            consistent_hash: true
            addresses: dnssrvnoa+_memcached-client._tcp.{{ template "loki.fullname" $ }}-results-cache.{{ $.Release.Namespace }}.svc
            timeout: {{ .timeout }}
            update_interval: 1m
      {{- end }}
      {{- end }}

To Reproduce Steps to reproduce the behavior:

  1. Configure the Helm chart to use an external Memcached cluster
  2. Review the Loki Config Map (no Memcache details appear)
  3. Review the Loki Memcached Metrics exported via /metrics (they no longer increase)

Expected behavior We should be able to use our own Memcached clusters, and the Helm chart should accommodate this.
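
For illustration only, one way the template could accommodate this is to let the hard-coded address be overridden while keeping the current in-cluster default. This is a sketch, not the chart's actual behavior: `.addresses` (that is, a `resultsCache.addresses` value) is a hypothetical key introduced here, and the fragment is meant to replace the `memcached_client` block inside the existing `with .Values.resultsCache` scope.

```yaml
          memcached_client:
            consistent_hash: true
            # Hypothetical override: use resultsCache.addresses when set,
            # otherwise fall back to the in-cluster service the chart deploys.
            addresses: {{ .addresses | default (printf "dnssrvnoa+_memcached-client._tcp.%s-results-cache.%s.svc" (include "loki.fullname" $) $.Release.Namespace) }}
            timeout: {{ .timeout }}
            update_interval: 1m
```

With something like this in place, leaving the value unset would keep the current rendering, and setting it to an external endpoint would point Loki at that cluster instead.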

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: helm

elburnetto-intapp avatar Apr 10 '24 15:04 elburnetto-intapp

You can use external memcached with SSD mode on v3.

The key things we set (for elasticache):

loki:
  memcached:
    chunk_cache:
      enabled: false
    results_cache:
      enabled: false

  storage_config:
    index_queries_cache_config:
      memcached:
        batch_size: 1024
        parallelism: 100
      memcached_client:
        addresses: "loki-results.0001.use1.cache.amazonaws.com:11211,loki-results.0002.use1.cache.amazonaws.com:11211"
        timeout: 5000ms
        max_idle_conns: 64
        max_item_size: 0
        consistent_hash: true

  structuredConfig:
    query_range:
      align_queries_with_step: true
      cache_results: true
      max_retries: 30
      cache_index_stats_results: true
      cache_volume_results: true
      cache_instant_metric_results: true
      instant_metric_query_split_align: true
      cache_series_results: true
      cache_label_results: true
      results_cache:
        compression: snappy
        cache:
          background:
            writeback_buffer: 500000
            writeback_goroutines: 1
            writeback_size_limit: 500MB
          default_validity: 12h
          memcached:
            batch_size: 1024
            parallelism: 100
          memcached_client:
            addresses: "loki-results.0001.use1.cache.amazonaws.com:11211,loki-results.0002.use1.cache.amazonaws.com:11211"
            timeout: 5000ms
            max_idle_conns: 64
            max_item_size: 0
            consistent_hash: true
      
    chunk_store_config:
      chunk_cache_config:
        background:
          writeback_buffer: 500000
          writeback_goroutines: 1
          writeback_size_limit: 500MB
        default_validity: 0s
        memcached:
          batch_size: 1024
          parallelism: 100
        memcached_client:
          addresses: "loki-chunk.0001.use1.cache.amazonaws.com:11211,loki-chunk.use1.cache.amazonaws.com:11211,loki-chunk.0003.use1.cache.amazonaws.com:11211"
          timeout: 5000ms
          max_idle_conns: 64
          max_item_size: 0
          consistent_hash: true
      write_dedupe_cache_config:
        memcached:
          batch_size: 1024
          parallelism: 100
        memcached_client:
          addresses: "loki-results.0002.use1.cache.amazonaws.com:11211"
          timeout: 5000ms
          max_idle_conns: 64
          max_item_size: 0
          consistent_hash: true

So we moved some of the config into structuredConfig to get it to work.
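
A stripped-down sketch of the same approach for a single external endpoint, reusing the host and tuning values from the original report. Two things here are assumptions rather than something confirmed in this thread: the port `11211` (the Memcached default, also used in the addresses above) and the top-level `chunksCache` key, which is assumed to be the chunk-cache counterpart of the `resultsCache` key referenced in the config template. Check both against the values file for your chart version.

```yaml
# Sketch only: turn off the chart-managed caches so the chart does not
# render its own in-cluster Memcached addresses into the Loki config.
chunksCache:
  enabled: false   # assumed key, counterpart of resultsCache
resultsCache:
  enabled: false   # the template reads .Values.resultsCache.enabled

loki:
  structuredConfig:
    chunk_store_config:
      chunk_cache_config:
        memcached:
          batch_size: 256
          parallelism: 10
        memcached_client:
          # Port 11211 is an assumption (Memcached default).
          addresses: "memcached.monitoring.svc.cluster.local:11211"
    query_range:
      cache_results: true
      results_cache:
        cache:
          default_validity: 12h
          memcached_client:
            addresses: "memcached.monitoring.svc.cluster.local:11211"
            timeout: 500ms
```

After applying it, the rendered Loki ConfigMap should contain the external address under both `memcached_client` blocks, and the Memcached metrics on /metrics should start increasing again.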

rknightion avatar Apr 11 '24 19:04 rknightion

@rknightion The issue is more that the Helm chart doesn't support external Memcached clusters (after amending the values file, we got it working with our external Memcached, so all is well). Just flagging it because it was a breaking change when we moved up to v3: Loki stopped using Memcached, and we only noticed once we saw it in our metrics.

elburnetto-intapp avatar Apr 12 '24 08:04 elburnetto-intapp

I came to the same conclusion.

There are also significant differences between the official Loki caching guide (https://grafana.com/docs/loki/latest/operations/caching/) and the default chart settings.

tkcontiant avatar Jun 25 '24 13:06 tkcontiant

We're seeing the same problem when trying to upgrade from 2.8.x (chart 5.7.4 to 6.7.1).

gybanez avatar Jul 18 '24 22:07 gybanez

Yeah, also having this issue. Makes a safe upgrade tricky.

@elburnetto-intapp can you explain the changes you made when you say "amending of the values file"? How did you get around this?

KA-ROM avatar Aug 20 '24 10:08 KA-ROM

I hit the same issue: we can't enable chunk and results caching with an external Memcached cluster. I came up with this diff to fix it: https://github.com/grafana/loki/pull/17432

abenbachir avatar May 01 '25 17:05 abenbachir