
3.6: region requirement when using s3 in compactor

danbka33 opened this issue 2 weeks ago · 2 comments

Describe the bug

base values.yaml:

# @default -- See values.yaml
loki:
  config:
    storage:
      # Loki requires a bucket for chunks and the ruler. GEL requires a third bucket for the admin API.
      # Please provide these values if you are using object storage.
      bucketNames:
        chunks: ${LOKI_S3_BUCKET_NAME}
        ruler: ${LOKI_S3_BUCKET_NAME}
        admin: ${LOKI_S3_BUCKET_NAME}
      type: s3
      s3:
        s3: "s3://${LOKI_S3_BUCKET_NAME}"
        endpoint: "http://${LOKI_S3_BUCKET_HOST}:${LOKI_S3_BUCKET_PORT}"
        region: "${LOKI_S3_BUCKET_REGION}"
        secretAccessKey: "${LOKI_S3_SECRET_ACCESS_KEY}"
        accessKeyId: "${LOKI_S3_ACCESS_KEY_ID}"
        signatureVersion: "v4"
        s3ForcePathStyle: true
        insecure: true
        http_config: {}

additional production values.yaml:

loki:
  image:
    tag: 3.6.2
  limits_config:
    retention_period: 60d
  compactor:
    working_directory: /var/loki/compactor
    compaction_interval: 30m
    retention_enabled: true
    retention_delete_delay: 2h
    delete_request_store: s3

I get this error and Loki crashes:

failed to init delete store: failed to get s3 object: operation error S3: GetObject, failed to resolve service endpoint, endpoint rule error, A region must be set when sending requests to S3

But if retention is not enabled, Loki handles an empty region in the s3 settings just fine.
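As a temporary workaround (this is my assumption, not a confirmed fix), setting a non-empty placeholder region next to the custom endpoint seems to be enough to satisfy the AWS SDK endpoint resolver, and Ceph RGW does not normally validate the value:

loki:
  config:
    storage:
      s3:
        endpoint: "http://${LOKI_S3_BUCKET_HOST}:${LOKI_S3_BUCKET_PORT}"
        # placeholder only; Ceph RGW ignores it, but it keeps the AWS SDK
        # endpoint resolver from failing (assumption, not verified upstream)
        region: "us-east-1"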

I pass the bucket configuration via a ConfigMap generated by the Rook Ceph operator in the cluster:

apiVersion: v1
data:
  BUCKET_HOST: rook-ceph-rgw-ceph-objectstore.rook-ceph.svc
  BUCKET_NAME: NDA-monitoring-loki-bu-afa1808b-ab32-400e-b84e-662b70423daa
  BUCKET_PORT: "80"
  BUCKET_REGION: ""
  BUCKET_SUBREGION: ""
kind: ConfigMap
metadata:
  name: NDA-monitoring-loki-bucket

To Reproduce

Steps to reproduce the behavior:

  1. Start Loki 3.6.2
  2. Enable retention
  3. Set the S3 region to an empty value
  4. Loki crashes

Expected behavior

Correct handling of an empty region for custom S3 providers.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: helm

danbka33 · Dec 08 '25

I get the same error running in single monolithic mode on Debian. My config works fine on v3.5.8, but updating to 3.6.2 crashes with

log.go:223 msg="error running loki" err="init compactor: failed to init delete store: operation error S3: DeleteObject, failed to resolve service endpoint, endpoint rule error, A region must be set when sending requests to S3.>

Config is

auth_enabled: false

server:
  grpc_server_max_recv_msg_size: 100000000
  grpc_server_max_send_msg_size: 100000000

common:
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

ingester:
  chunk_target_size: 1536000
  max_chunk_age: 6h
  chunk_idle_period: 1h
  wal:
    dir: /var/lib/loki/wal

compactor:
  working_directory: /var/lib/loki/compactor
  retention_enabled: true
  delete_request_store: aws
  retention_delete_delay: 1h

query_scheduler:
  max_outstanding_requests_per_tenant: 10000

schema_config:
  configs:
    - from: 2021-01-01
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h
    - from: 2022-05-08
      store: boltdb-shipper
      object_store: s3
      schema: v12
      index:
        prefix: index_
        period: 24h
    - from: 2023-04-06
      store: tsdb
      object_store: s3
      schema: v12
      index:
        prefix: tsdb_index_
        period: 24h
    - from: 2024-09-20
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  index_queries_cache_config:
  boltdb_shipper:
    active_index_directory: /var/lib/loki/index
    cache_location: /var/lib/loki/boltdb-cache
  aws:
    s3: s3://nnnnnnnnnn:nnnnnnnnnnnnnn@eu-west-1/loki.tideconnects.com
  tsdb_shipper:
    active_index_directory: /var/lib/loki/tsdb-index
    cache_location: /var/lib/loki/tsdb-cache

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 1h
  ingestion_rate_mb: 16
  ingestion_burst_size_mb: 24
  retention_period: 1y
  allow_structured_metadata: false
  max_query_length: 365d1h
  deletion_mode: filter-only

ruler:
  storage:
    type: local
    local:
      directory: /etc/loki/rules
  rule_path: /var/lib/loki/rules
  enable_api: true

stevenbrookes · Dec 09 '25

@stevenbrookes Yes, I also had to specify http/https in the endpoint. Previously it recognized the endpoint correctly without the scheme being specified explicitly.

previous:

s3:
      s3: "s3://${LOKI_S3_BUCKET_NAME}"
      endpoint: "${LOKI_S3_BUCKET_HOST}:${LOKI_S3_BUCKET_PORT}"

current:

s3:
      s3: "s3://${LOKI_S3_BUCKET_NAME}"
      endpoint: "http://${LOKI_S3_BUCKET_HOST}:${LOKI_S3_BUCKET_PORT}"

danbka33 · Dec 10 '25

I managed to fix this. It wasn't related to the S3 config at all, but to wrong file ownership of /var/lib/loki/compactor/deletion/series_progress.boltdb. For some reason it was owned by root while Loki runs as the loki user. Leaving this here in case someone else gets similarly confused ;)
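In case it helps, a minimal sketch of the fix, assuming Loki runs under systemd as the loki user and group (adjust user, group, and unit name to your setup):

# hand the compactor working directory (and the boltdb file inside it) back to loki
sudo chown -R loki:loki /var/lib/loki/compactor
# restart so the compactor re-opens the delete store
sudo systemctl restart loki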

stevenbrookes · Dec 14 '25