helm-charts

[loki-distributed] memcache get multi failing

Open rotarur opened this issue 3 years ago • 6 comments

I'm seeing this issue on the loki-distributed querier:

level=warn ts=2022-03-25T12:05:36.385437066Z caller=memcached_client.go:228 msg="error updating memcache servers" err="lookup _memcache._tcp.loki-distributed.cfg.euw1.cache.amazonaws.com:11211: no such host"
level=error ts=2022-03-25T14:03:42.277835662Z caller=memcached.go:224 msg="failed to put to memcached" name=store.index-cache-read. err="server=10.5.101.163:11211: read tcp 10.5.110.131:42866->10.5.101.163:11211: i/o timeout"
ts=2022-03-25T14:03:42.784573393Z caller=spanlogger.go:87 org_id=fake method=SeriesStore.lookupLabelNamesBySeries level=debug queries=23252
level=debug ts=2022-03-25T14:03:42.877727452Z caller=logging.go:67 traceID=0389dc6226f7835c orgID=fake msg="GET /ready (200) 36.291µs"
ts=2022-03-25T14:03:43.577379945Z caller=spanlogger.go:87 org_id=fake method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.5.110.131:44384->10.5.101.163:11211: i/o timeout"

I'm using AWS memcached, and the cluster has network access to it, but I still get the lookup errors followed by the GetMulti error. Please help.

rotarur avatar Mar 25 '22 14:03 rotarur
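
A note on the first warning: the name in the lookup error ends in `:11211`, which suggests the port ended up inside the hostname used for the SRV query. Below is a minimal sketch, assuming the client builds the query name as `_<service>._tcp.<host>` (the way Go's `net.LookupSRV` does). `srv_query_name` and `has_embedded_port` are hypothetical helper names for illustration only, not Loki code.

```python
def srv_query_name(service: str, proto: str, host: str) -> str:
    """Compose the DNS name that would be queried for an SRV record."""
    return f"_{service}._{proto}.{host}"


def has_embedded_port(host: str) -> bool:
    """Return True if the host string ends in ':<digits>', i.e. carries a port."""
    name, sep, port = host.rpartition(":")
    return bool(sep) and bool(name) and port.isdigit()


# Host value inferred from the warning in the log above (an assumption about the config).
host = "loki-distributed.cfg.euw1.cache.amazonaws.com:11211"
print(srv_query_name("memcache", "tcp", host))
print(has_embedded_port(host))
```

If that is what happened here, the SRV lookup would fail regardless of network access, because the queried name includes the port. Whether the configuration in this issue actually sets the host that way is not shown in the thread.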

anyone?

rotarur avatar Apr 12 '22 08:04 rotarur

Same here:

ts=2022-05-19T15:58:02.97373921Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:34056->10.37.142.107:11211: i/o timeout"
ts=2022-05-19T15:58:02.974094881Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:39152->10.37.139.184:11211: i/o timeout"
ts=2022-05-19T15:58:02.975415578Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:39444->10.37.139.184:11211: i/o timeout"

I increased memcached_client.timeout to 1000ms, but the errors are the same.

I ran telnet from the querier container to the memcached container on port 11211, and it worked fine.

Can you help us?

mbonanata avatar May 19 '22 16:05 mbonanata

Hi, I also see the same error. Has anyone found a solution to this issue in the loki-distributed Helm chart?

himeshkothari avatar Jun 27 '22 09:06 himeshkothari

same

lucenabruno avatar Sep 02 '22 10:09 lucenabruno

https://github.com/cortexproject/cortex/issues/1505#issuecomment-808428354

lucenabruno avatar Sep 02 '22 10:09 lucenabruno

Same for me. I'm using the addresses param, so the cause is not the SRV DNS lookup.

Bigouden avatar Sep 23 '22 08:09 Bigouden

I read https://community.grafana.com/t/memcached-config-in-k8s-distributed-model/52042/4

How do I set the Memcached address for the querier?

The Helm manifest contains:

    query_range:
      align_queries_with_step: true
      cache_results: true
      max_retries: 5
      results_cache:
        cache:
          memcached_client:
            addresses: dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
            consistent_hash: true

But

host -t SRV    dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Host dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local not found: 3(NXDOMAIN)

and

nslookup  -q=SRV   dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Server:		10.236.181.2
Address:	10.236.181.2#53

** server can't find dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local: NXDOMAIN

But

nslookup loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Server:		10.236.181.2
Address:	10.236.181.2#53

Name:	loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Address: 10.236.165.152


telnet loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local 11211
Connected to loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local

Checking the IP:

kubectl get pod -n loki -o wide | grep 10.236.165.152
loki-loki-distributed-memcached-frontend-0              1/1     Running   0                3h10m   10.236.165.152   cl18384mb5q6n2sbqupr-ogyt   <none>           <none>

patsevanton avatar Dec 13 '22 11:12 patsevanton
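
One thing worth checking in the lookups above: `dnssrv+` is a service-discovery prefix interpreted by the client, not part of the DNS name, so passing the whole string to `host` or `nslookup` would return NXDOMAIN even if the SRV record exists. Below is a minimal sketch of separating the two. `split_discovery_address` is a hypothetical helper, and the prefix list is an assumption based on the prefixes used by Cortex-style DNS service discovery.

```python
from typing import Tuple

# Assumed set of service-discovery prefixes; longest first so matching is unambiguous.
KNOWN_PREFIXES = ("dnssrvnoa+", "dnssrv+", "dns+")


def split_discovery_address(address: str) -> Tuple[str, str]:
    """Split 'dnssrv+<name>' into ('dnssrv', '<name>').

    Returns ('', address) unchanged when no known prefix is present.
    """
    for prefix in KNOWN_PREFIXES:
        if address.startswith(prefix):
            return prefix.rstrip("+"), address[len(prefix):]
    return "", address


addr = ("dnssrv+_memcached-client._tcp."
        "loki-loki-distributed-memcached-frontend.loki.svc.cluster.local")
scheme, dns_name = split_discovery_address(addr)
print(scheme)    # lookup type the client would use
print(dns_name)  # the name to pass to `host -t SRV` or `nslookup -q=SRV`
```

Under that assumption, the name to query by hand is the part after the prefix, starting at `_memcached-client._tcp.`. Whether that SRV record resolves in this cluster is not shown in the thread.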