AWS region is set to `dummy`
Describe the bug
The AWS region is not taken into account, at least for loki-backend pods, when accessing AWS STS. This continuously produces error messages (see the log in the output section). Apart from these error messages, everything works as expected.
After looking into the code (as a non-native Go speaker), the culprit seems to lie around lines 224-232 and 245-247 of s3_storage_client.go, where the region should be set on the s3Config object.
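To make the expectation concrete, here is a minimal sketch of the region lookup order one might expect: the configured region first, then the environment, and an explicit error instead of a placeholder. This is not Loki's actual code. `resolveRegion` is a hypothetical helper, and the sketch assumes the pod has `AWS_REGION` or `AWS_DEFAULT_REGION` in its environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveRegion is a hypothetical helper (not part of Loki or the AWS SDK).
// It prefers the explicitly configured region, then falls back to the
// environment, and returns an error rather than a placeholder region.
func resolveRegion(configured string, getenv func(string) string) (string, error) {
	if configured != "" {
		return configured, nil
	}
	// Assumption: these variables are present in the pod's environment.
	for _, key := range []string{"AWS_REGION", "AWS_DEFAULT_REGION"} {
		if v := getenv(key); v != "" {
			return v, nil
		}
	}
	return "", errors.New("no AWS region configured")
}

func main() {
	region, err := resolveRegion("", os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("region:", region)
}
```

Failing loudly when no region is found would surface the misconfiguration at startup instead of as repeated DNS errors at runtime.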
To Reproduce
Steps to reproduce the behavior:
- Deploy Loki (2.9.3) via Helm chart (5.41.0)
- Look at the log of a loki-backend pod
Expected behavior
The endpoint in use does not contain the `dummy` region, so no error is thrown.
Environment:
- Infrastructure: Kubernetes on AWS
- Deployment tool: helm via helmfile
Screenshots, Promtail config, or terminal output
Loki log:
level=info ts=2023-12-12T10:44:11.495384847Z caller=loki.go:505 msg="Loki started"
level=error ts=2023-12-12T10:44:11.505356978Z caller=ruler.go:571 msg="unable to list rules" err="WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.dummy.amazonaws.com/\": dial tcp: lookup sts.dummy.amazonaws.com on 172.20.0.10:53: no such host"
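The hostname in the log follows the regional STS pattern `sts.<region>.amazonaws.com`, so the region value used by the client can be read straight from the failing URL. The sketch below only illustrates that pattern. `regionalSTSEndpoint` is a hypothetical function, not the AWS SDK's endpoint resolver.

```go
package main

import "fmt"

// regionalSTSEndpoint is a hypothetical illustration of the regional STS
// hostname pattern in the standard AWS partition. It is not the SDK's
// resolver and does no validation of the region name.
func regionalSTSEndpoint(region string) string {
	return fmt.Sprintf("https://sts.%s.amazonaws.com/", region)
}

func main() {
	// A placeholder region produces the unresolvable host seen in the log.
	fmt.Println(regionalSTSEndpoint("dummy"))
	// A real region produces a resolvable regional endpoint.
	fmt.Println(regionalSTSEndpoint("eu-central-1"))
}
```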
Loki helm values:
loki:
  auth_enabled: false
  commonConfig:
    path_prefix: /var/loki
    replication_factor: 3
  compactor:
    apply_retention_interval: 1h
    compaction_interval: 5m
    retention_delete_worker_count: 500
    retention_enabled: true
    shared_store: s3
  schemaConfig:
    configs:
      - from: 2018-04-15
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: loki_index_
          period: 24h
  server:
    http_listen_port: 3100
  storage_config:
    boltdb_shipper:
      active_index_directory: /var/loki/index
      cache_location: /var/loki/index_cache
      shared_store: s3
    aws:
      bucketnames: {{ .Values.loki.bucket_name }}
      region: {{ .Values.aws.region }}
      s3forcepathstyle: false
serviceAccount:
  create: true
  name: loki
  annotations:
    eks.amazonaws.com/role-arn: {{ .Values.loki.s3_access_role }}
Screenshot of the applied env vars:
I'm seeing this as well, but on 2.9.1.
Sorry, I'm not using the Loki Helm chart, but I figured it out.
The config must have the region set here when using IRSA:
common:
  compactor_address: 'loki'
  path_prefix: /var/loki
  replication_factor: 2
  storage:
    s3:
      bucketnames: {{ .Values.s3_bucket }}
      region: {{ .Values.region }}
Mine's working now. The env vars did not matter, for whatever reason.
Thank you for your response. I tried your suggestion and put the storage block into the commonConfig block in my config, but unfortunately the issue is still the same.
I have the same issue
I'm facing the same issue. Any update on this?
I had the same error message when deploying Loki via Helm as singleBinary. After adding the list element "ruler: BUCKET_NAME", it disappeared:
# values.yaml
loki:
  ..
  storage:
    bucketNames:
      chunks: BUCKET_NAME
      ruler: BUCKET_NAME
    type: s3
    s3:
      s3: s3://BUCKET_NAME
      region: "eu-central-1"
      accessKeyId: "${GRAFANA_LOKI_S3_ACCESKEYID}"
      secretAccessKey: "${GRAFANA_LOKI_S3_SECRETACCESSKEY}"
      s3ForcePathStyle: false
      insecure: false
https://grafana.com/docs/loki/latest/setup/install/helm/install-monolithic/
Which makes sense, since the helm chart's _helpers.tpl is looking for $.Values.loki.storage.bucketNames.ruler
https://github.com/grafana/loki/blob/main/production/helm/loki/templates/_helpers.tpl#L342
Hello 0xdnL, thank you for this hint. I have now tested your suggestion, but unfortunately the error persists. This is the change that I tried (among several variations):
commonConfig:
  path_prefix: /var/loki
  replication_factor: 3
storage:
  bucketNames:
    ruler: {{ .Values.loki.bucket_name }}
    chunks: {{ .Values.loki.bucket_name }}
  type: s3
  s3:
    s3: {{ .Values.loki.bucket_name }}
    region: {{ .Values.aws.region }}
    s3forcepathstyle: false
Can someone from Grafana add a definitive working example values.yaml to the example directory for distributed Loki with an S3 backend using IRSA?
Small update: we upgraded to Loki 3.0.0 via Helm chart version 6.3.3, and the error still persists.
Can confirm this is still the case as of Chart v6.6.4:
init compactor: failed to init delete store: failed to get s3 object: WebIdentityErr: failed to retrieve credentials
caused by: RequestError: send request failed
caused by: Post "https://sts.dummy.amazonaws.com/": 3 errors occurred:
* dial tcp: lookup sts.dummy.amazonaws.com on 172.20.0.10:53: no such host
* dial tcp: lookup sts.dummy.amazonaws.com on 172.20.0.10:53: no such host
* dial tcp: lookup sts.dummy.amazonaws.com on 172.20.0.10:53: no such host