helm-charts
[loki-distributed] memcache get multi failing
I'm seeing this issue on loki-distributed-querier:
level=warn ts=2022-03-25T12:05:36.385437066Z caller=memcached_client.go:228 msg="error updating memcache servers" err="lookup _memcache._tcp.loki-distributed.cfg.euw1.cache.amazonaws.com:11211: no such host"
level=error ts=2022-03-25T14:03:42.277835662Z caller=memcached.go:224 msg="failed to put to memcached" name=store.index-cache-read. err="server=10.5.101.163:11211: read tcp 10.5.110.131:42866->10.5.101.163:11211: i/o timeout"
ts=2022-03-25T14:03:42.784573393Z caller=spanlogger.go:87 org_id=fake method=SeriesStore.lookupLabelNamesBySeries level=debug queries=23252
level=debug ts=2022-03-25T14:03:42.877727452Z caller=logging.go:67 traceID=0389dc6226f7835c orgID=fake msg="GET /ready (200) 36.291µs"
ts=2022-03-25T14:03:43.577379945Z caller=spanlogger.go:87 org_id=fake method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.5.110.131:44384->10.5.101.163:11211: i/o timeout"
I'm using AWS Memcached, and it is reachable from the cluster, but I still get the lookup errors followed by the GetMulti error. Please help.
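One detail in the first warning may be worth checking: the name being looked up ends in `:11211`. An SRV lookup queries a name of the form `_service._tcp.host`, so a port inside the configured host would end up in the query name and could never resolve. The sketch below only reproduces how such a name is composed; the `service` and `host` values are assumptions inferred from the log line, not taken from the reporter's actual config, and this is a reading of the log rather than a confirmed diagnosis.

```python
def srv_lookup_name(service: str, host: str, proto: str = "tcp") -> str:
    """Compose the DNS name queried for an SRV lookup: _service._proto.host."""
    return f"_{service}._{proto}.{host}"


def has_embedded_port(host: str) -> bool:
    """Return True if the host string ends in :<digits>, which a DNS name cannot contain."""
    _, sep, tail = host.rpartition(":")
    return bool(sep) and tail.isdigit()


if __name__ == "__main__":
    # Assumed values, inferred from the warning in the log above.
    host = "loki-distributed.cfg.euw1.cache.amazonaws.com:11211"
    print(srv_lookup_name("memcache", host))
    print("port embedded in host:", has_embedded_port(host))
```

If the host does carry a port, removing it from the SRV host setting (or switching to a plain address list) would be the first thing to try. Whether the endpoint publishes SRV records at all is a separate question.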
anyone?
Same here:
ts=2022-05-19T15:58:02.97373921Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:34056->10.37.142.107:11211: i/o timeout"
ts=2022-05-19T15:58:02.974094881Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:39152->10.37.139.184:11211: i/o timeout"
ts=2022-05-19T15:58:02.975415578Z caller=spanlogger.go:87 org_id=cloud method=Memcache.GetMulti level=error msg="Failed to get keys from memcached" err="read tcp 10.37.139.120:39444->10.37.139.184:11211: i/o timeout"
I increased memcached_client.timeout to 1000ms, but the errors persist.
I ran telnet from the querier container to the memcached container on port 11211 and the connection worked fine.
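A successful telnet only shows that the TCP connection can be opened. The errors above are read timeouts (`read tcp ...: i/o timeout`), meaning the connection existed but the reply did not arrive in time. A slightly stronger check is to send a real memcached command and time the response. This is a minimal sketch using the text-protocol `version` command; the function name and the default 1 second timeout are illustrative choices, not anything taken from Loki.

```python
import socket
import time


def memcached_version(host, port=11211, timeout=1.0):
    """Send the memcached text-protocol `version` command and time the reply.

    Returns (reply_line, elapsed_seconds). Raises socket.timeout / OSError
    if the connection or the read does not complete within `timeout`.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        sock.sendall(b"version\r\n")
        reply = b""
        # Read until the terminating CRLF or until the server closes the socket.
        while not reply.endswith(b"\r\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    elapsed = time.monotonic() - start
    return reply.decode("ascii", errors="replace").strip(), elapsed
```

Running it from the querier pod against each memcached address, a few times in a row, would show whether replies are merely slow or sometimes missing. A single small command succeeding does not rule out timeouts on large GetMulti batches.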
Can you help us?
Hi, I also see the same error. Has anyone found a solution to this issue in the loki-distributed helm chart?
same
https://github.com/cortexproject/cortex/issues/1505#issuecomment-808428354
Same for me. I'm using the addresses parameter, so the cause is not the SRV DNS lookup.
I read https://community.grafana.com/t/memcached-config-in-k8s-distributed-model/52042/4
How do I set the Memcached address for the querier?
The Helm manifest contains this config:
query_range:
  align_queries_with_step: true
  cache_results: true
  max_retries: 5
  results_cache:
    cache:
      memcached_client:
        addresses: dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
        consistent_hash: true
But
host -t SRV dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Host dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local not found: 3(NXDOMAIN)
and
nslookup -q=SRV dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Server: 10.236.181.2
Address: 10.236.181.2#53
** server can't find dnssrv+_memcached-client._tcp.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local: NXDOMAIN
But
nslookup loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Server: 10.236.181.2
Address: 10.236.181.2#53
Name: loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Address: 10.236.165.152
telnet loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local 11211
Connected to loki-loki-distributed-memcached-frontend-0.loki-loki-distributed-memcached-frontend.loki.svc.cluster.local
Checking the IP:
kubectl get pod -n loki -o wide | grep 10.236.165.152
loki-loki-distributed-memcached-frontend-0 1/1 Running 0 3h10m 10.236.165.152 cl18384mb5q6n2sbqupr-ogyt <none> <none>
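A note on the two NXDOMAIN results above: both commands pass the full `addresses` value, including the `dnssrv+` prefix, to the resolver. As far as I understand it, that prefix is a service-discovery hint read by the memcached client (`dns+`, `dnssrv+`, and `dnssrvnoa+` are the usual forms) and is not part of the DNS name, so `host` and `nslookup` would fail on it regardless of whether the SRV record exists. The sketch below separates the prefix from the name that should actually be queried; the helper name is made up for illustration.

```python
# Discovery prefixes commonly used in Cortex/Loki-style address lists.
DISCOVERY_PREFIXES = ("dnssrvnoa+", "dnssrv+", "dns+")


def split_discovery_address(address):
    """Split 'dnssrv+_svc._tcp.name' into ('dnssrv', '_svc._tcp.name').

    Addresses without a known prefix are returned as ('', address).
    """
    for prefix in DISCOVERY_PREFIXES:
        if address.startswith(prefix):
            return prefix.rstrip("+"), address[len(prefix):]
    return "", address


if __name__ == "__main__":
    addr = ("dnssrv+_memcached-client._tcp."
            "loki-loki-distributed-memcached-frontend.loki.svc.cluster.local")
    scheme, name = split_discovery_address(addr)
    # The name to pass to `host -t SRV` is the part after the prefix.
    print(f"scheme={scheme}")
    print(f"host -t SRV {name}")
```

Re-running `host -t SRV` with only the printed name would tell whether the SRV record is really missing or whether the earlier lookups failed only because of the prefix.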