Compactor retention with tsdb_shipper does not work
**Describe the bug**
Compactor retention does not work with the TSDB shipper on AWS S3 object storage.
**To Reproduce**
Steps to reproduce the behavior:

- Install the Helm chart with a custom values file (`ssd-values.yaml`):

```shell
helm install loki grafana/loki --version 5.41.8 --namespace loki-ns -f ssd-values.yaml
```

```yaml
# ssd-values.yaml
loki:
  # -- The number of old ReplicaSets to retain to allow rollback
  revisionHistoryLimit: 2
  # -- Config file contents for Loki
  # @default -- See values.yaml
  config: |
    {{- if .Values.enterprise.enabled}}
    {{- tpl .Values.enterprise.config . }}
    {{- else }}
    auth_enabled: {{ .Values.loki.auth_enabled }}
    {{- end }}

    {{- with .Values.loki.server }}
    server:
      {{- toYaml . | nindent 2}}
    {{- end}}

    memberlist:
    {{- if .Values.loki.memberlistConfig }}
    {{- toYaml .Values.loki.memberlistConfig | nindent 2 }}
    {{- else }}
    {{- if .Values.loki.extraMemberlistConfig}}
    {{- toYaml .Values.loki.extraMemberlistConfig | nindent 2}}
    {{- end }}
      join_members:
        - {{ include "loki.memberlist" . }}
        {{- with .Values.migrate.fromDistributed }}
        {{- if .enabled }}
        - {{ .memberlistService }}
        {{- end }}
        {{- end }}
    {{- end }}

    {{- with .Values.loki.ingester }}
    ingester:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- if .Values.loki.commonConfig}}
    common:
    {{- toYaml .Values.loki.commonConfig | nindent 2}}
      storage:
      {{- include "loki.commonStorageConfig" . | nindent 4}}
    {{- end}}

    {{- with .Values.loki.limits_config }}
    limits_config:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    runtime_config:
      file: /etc/loki/runtime-config/runtime-config.yaml

    {{- with .Values.loki.memcached.chunk_cache }}
    {{- if and .enabled (or .host .addresses) }}
    chunk_store_config:
      chunk_cache_config:
        memcached:
          batch_size: {{ .batch_size }}
          parallelism: {{ .parallelism }}
        memcached_client:
          {{- if .host }}
          host: {{ .host }}
          {{- end }}
          {{- if .addresses }}
          addresses: {{ .addresses }}
          {{- end }}
          service: {{ .service }}
    {{- end }}
    {{- end }}

    {{- if .Values.loki.schemaConfig }}
    schema_config:
    {{- toYaml .Values.loki.schemaConfig | nindent 2}}
    {{- else }}
    schema_config:
      configs:
        - from: 2022-01-11
          store: boltdb-shipper
          object_store: {{ .Values.loki.storage.type }}
          schema: v12
          index:
            prefix: loki_index_
            period: 24h
    {{- end }}

    {{ include "loki.rulerConfig" . }}

    {{- if or .Values.tableManager.retention_deletes_enabled .Values.tableManager.retention_period }}
    table_manager:
      retention_deletes_enabled: {{ .Values.tableManager.retention_deletes_enabled }}
      retention_period: {{ .Values.tableManager.retention_period }}
    {{- end }}

    {{- with .Values.loki.memcached.results_cache }}
    query_range:
      align_queries_with_step: true
      {{- if and .enabled (or .host .addresses) }}
      cache_results: {{ .enabled }}
      results_cache:
        cache:
          default_validity: {{ .default_validity }}
          memcached_client:
            {{- if .host }}
            host: {{ .host }}
            {{- end }}
            {{- if .addresses }}
            addresses: {{ .addresses }}
            {{- end }}
            service: {{ .service }}
            timeout: {{ .timeout }}
      {{- end }}
    {{- end }}

    {{- with .Values.loki.storage_config }}
    storage_config:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.query_scheduler }}
    query_scheduler:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.compactor }}
    compactor:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.analytics }}
    analytics:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.querier }}
    querier:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.index_gateway }}
    index_gateway:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.frontend }}
    frontend:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.frontend_worker }}
    frontend_worker:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    {{- with .Values.loki.distributor }}
    distributor:
      {{- tpl (. | toYaml) $ | nindent 4 }}
    {{- end }}

    tracing:
      enabled: {{ .Values.loki.tracing.enabled }}

  # Should authentication be enabled
  auth_enabled: true

  # -- Check https://grafana.com/docs/loki/latest/configuration/#server for more info on the server configuration.
  server:
    log_format: "logfmt"
    log_level: "info"
    log_source_ips_enabled: true
    log_request_headers: true
    log_request_at_info_level_enabled: true

  # -- Limits config
  limits_config:
    max_line_size: 10KB
    per_stream_rate_limit: 5MB
    per_stream_rate_limit_burst: 20MB
    split_queries_by_interval: 15m
    retention_period: 7d
    retention_stream:
      - selector: '{environment="dev"}'
        priority: 1
        period: 1d
      - selector: '{environment="stg"}'
        priority: 1
        period: 2d
    shard_streams:
      enabled: false
    allow_structured_metadata: true

  # -- Provides a reloadable runtime configuration file for some specific configuration
  runtimeConfig: {}

  # -- Check https://grafana.com/docs/loki/latest/configuration/#common_config for more info on how to provide a common configuration
  commonConfig:
    path_prefix: /var/loki
    replication_factor: 3
    ring:
      kvstore:
        store: "memberlist"
    compactor_address: '{{ include "loki.compactorAddress" . }}'

  # -- Storage config. Providing this will automatically populate all necessary storage configs in the templated config.
  storage:
    bucketNames:
      chunks: kps-shr-tools-s3-loki
      ruler: kps-shr-tools-s3-loki
    type: s3
    s3:
      region: ap-northeast-2

  # -- Configure memcached as an external cache for chunk and results cache. Disabled by default
  # must enable and specify a host for each cache you would like to use.
  memcached:
    chunk_cache:
      enabled: false
    results_cache:
      enabled: false

  # -- Check https://grafana.com/docs/loki/latest/configuration/#schema_config for more info on how to configure schemas
  schemaConfig:
    configs:
      - from: "2024-01-01"
        store: tsdb
        object_store: s3
        schema: v13
        index:
          prefix: tsdb_index_
          period: 24h

  # -- Check https://grafana.com/docs/loki/latest/configuration/#ruler for more info on configuring ruler
  rulerConfig: {}

  # -- Structured loki configuration, takes precedence over `loki.config`, `loki.schemaConfig`, `loki.storageConfig`
  structuredConfig:
    common:
      storage:
        s3:
          storage_class: "STANDARD"
        hedging:
          at: 250ms
          up_to: 3
          max_per_second: 20
    query_range:
      results_cache:
        cache:
          enable_fifocache: false
          embedded_cache:
            enabled: true
            max_size_mb: 150
            ttl: 30m
        compression: "snappy"
      cache_results: true
      cache_index_stats_results: false

  # -- Additional query scheduler config
  query_scheduler:
    max_outstanding_requests_per_tenant: 32768
    querier_forget_delay: 60s

  # -- Additional storage config
  storage_config:
    aws:
      bucketnames: kps-shr-tools-s3-loki
      region: ap-northeast-2
      insecure: false
      storage_class: "STANDARD"
    tsdb_shipper:
      active_index_directory: /var/loki/ingester/tsdb_shipper
      shared_store: "s3"
      shared_store_key_prefix: "tsdb_shipper/"
      cache_location: /var/loki/index_gateway/tsdb_shipper
      index_gateway_client:
        log_gateway_requests: true

  # -- Optional compactor configuration
  compactor:
    working_directory: "/var/loki/compactor"
    shared_store: "s3"
    shared_store_key_prefix: "compoactor/"
    retention_enabled: true
    compactor_ring:
      kvstore:
        store: "memberlist"

  # -- Optional analytics configuration
  analytics:
    reporting_enabled: false

  # -- Optional querier configuration
  querier:
    tail_max_duration: 30m
    max_concurrent: 16
    multi_tenant_queries_enabled: true

  # -- Optional ingester configuration
  ingester:
    lifecycler:
      ring:
        kvstore:
          store: "memberlist"
      final_sleep: 15s
    wal:
      dir: "/var/loki/ingester/wal"
      flush_on_shutdown: true
      replay_memory_ceiling: 1GB

  # -- Optional index gateway configuration
  index_gateway:
    mode: ring
    ring:
      kvstore:
        store: "memberlist"

  frontend:
    scheduler_address: '{{ include "loki.querySchedulerAddress" . }}'
    log_queries_longer_than: 5s
    query_stats_enabled: true
    scheduler_dns_lookup_period: 3s
    compress_responses: true

  frontend_worker:
    match_max_concurrent: true
    scheduler_address: '{{ include "loki.querySchedulerAddress" . }}'

  # -- Optional distributor configuration
  distributor:
    ring:
      kvstore:
        store: "memberlist"
    rate_store:
      debug: true
    write_failures_logging:
      add_insights_label: true

  # -- Enable tracing
  tracing:
    enabled: true

enterprise:
  # Enable enterprise features, license must be provided
  enabled: false

# -- Options that may be necessary when performing a migration from another helm chart
migrate:
  # -- When migrating from a distributed chart like loki-distributed or enterprise-logs
  fromDistributed:
    # -- Set to true if migrating from a distributed helm chart
    enabled: false

serviceAccount:
  # -- Specifies whether a ServiceAccount should be created
  create: true
  # -- The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: loki-sa
  # -- Annotations for the service account
  annotations:
    eks.amazonaws.com/role-arn: (...skip...)
  # -- Set this toggle to false to opt out of automounting API credentials for the service account
  automountServiceAccountToken: true

# RBAC configuration
rbac:
  # -- If pspEnabled true, a PodSecurityPolicy is created for K8s that use psp.
  pspEnabled: false
  # -- For OpenShift set pspEnabled to 'false' and sccEnabled to 'true' to use the SecurityContextConstraints.
  sccEnabled: false

# -- Section for configuring optional Helm test
test:
  enabled: false

# Monitoring section determines which monitoring features to enable
monitoring:
  # Dashboards for monitoring Loki
  dashboards:
    # -- If enabled, create configmap with dashboards for monitoring Loki
    enabled: false
  # Recording rules for monitoring Loki, required for some dashboards
  rules:
    # -- If enabled, create PrometheusRule resource with Loki recording rules
    enabled: false
    # -- Include alerting rules
    alerting: false
  # ServiceMonitor configuration
  serviceMonitor:
    # -- If enabled, ServiceMonitor resources for Prometheus Operator are created
    enabled: false
  # Self monitoring determines whether Loki should scrape its own logs.
  # This feature currently relies on the Grafana Agent Operator being installed,
  # which is installed by default using the grafana-agent-operator sub-chart.
  # It will create custom resources for GrafanaAgent, LogsInstance, and PodLogs to configure
  # scrape configs to scrape its own logs with the labels expected by the included dashboards.
  selfMonitoring:
    enabled: false
  # The Loki canary pushes logs to and queries from this loki installation to test
  # that it's working correctly
  lokiCanary:
    enabled: false

# Configuration for the write pod(s)
write:
  # -- Number of replicas for the write
  replicas: 3
  autoscaling:
    # -- Enable autoscaling for the write.
    enabled: false
  # -- Comma-separated list of Loki modules to load for the write
  targetModule: "write"
  # -- Resource requests and limits for the write
  resources:
    limits:
      cpu: 1.5
      memory: 2Gi
    requests:
      cpu: 500m
      memory: 500Mi
  # -- Grace period to allow the write to shutdown before it is killed. Especially for the ingester,
  # this must be increased. It must be long enough so writes can be gracefully shutdown flushing/transferring
  # all data and to successfully leave the member ring on shutdown.
  terminationGracePeriodSeconds: 300
  # -- The default is to deploy all pods in parallel.
  podManagementPolicy: "Parallel"
  persistence:
    # -- Enable volume claims in pod spec
    volumeClaimsEnabled: true
    # -- Enable StatefulSetAutoDeletePVC feature
    enableStatefulSetAutoDeletePVC: false
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: loki-sc

# Configuration for the table-manager
tableManager:
  # -- Specifies whether the table-manager should be enabled
  enabled: false

# Configuration for the read pod(s)
read:
  # -- Number of replicas for the read
  replicas: 2
  autoscaling:
    # -- Enable autoscaling for the read, this is only used if `queryIndex.enabled: true`
    enabled: false
  # -- Comma-separated list of Loki modules to load for the read
  targetModule: "read"
  # -- Whether or not to use the 2 target type simple scalable mode (read, write) or the
  # 3 target type (read, write, backend). Legacy refers to the 2 target type, so true will
  # run two targets, false will run 3 targets.
  legacyReadTarget: false
  # -- Resource requests and limits for the read
  resources:
    limits:
      cpu: 1.5
      memory: 2Gi
    requests:
      cpu: 500m
      memory: 500Mi
  # -- Grace period to allow the read to shutdown before it is killed
  terminationGracePeriodSeconds: 30

# Configuration for the backend pod(s)
backend:
  # -- Number of replicas for the backend
  replicas: 2
  autoscaling:
    # -- Enable autoscaling for the backend.
    enabled: false
  # -- Comma-separated list of Loki modules to load for the read
  targetModule: "backend"
  # -- Resource requests and limits for the backend
  resources:
    limits:
      cpu: 1
      memory: 1Gi
    requests:
      cpu: 500m
      memory: 500Mi
  # -- Grace period to allow the backend to shutdown before it is killed. Especially for the ingester,
  # this must be increased. It must be long enough so backends can be gracefully shutdown flushing/transferring
  # all data and to successfully leave the member ring on shutdown.
  terminationGracePeriodSeconds: 300
  podManagementPolicy: "Parallel"
  persistence:
    # -- Enable volume claims in pod spec
    volumeClaimsEnabled: true
    # -- Enable StatefulSetAutoDeletePVC feature
    enableStatefulSetAutoDeletePVC: true
    # -- Storage class to be used.
    # If defined, storageClassName: <storageClass>.
    # If set to "-", storageClassName: "", which disables dynamic provisioning.
    # If empty or set to null, no storageClassName spec is
    # set, choosing the default provisioner (gp2 on AWS, standard on GKE, AWS, and OpenStack).
    storageClass: loki-sc

# Configuration for the single binary node(s)
singleBinary:
  # -- Number of replicas for the single binary
  replicas: 0

# Use either this ingress or the gateway, but not both at once.
# If you enable this, make sure to disable the gateway.
# You'll need to supply authn configuration for your ingress controller.
ingress:
  enabled: true
  ingressClassName: "alb"
  annotations: (...skip...)
  paths: (...skip...)
  hosts: (...skip...)

# Configuration for the memberlist service
memberlist:
  service:
    publishNotReadyAddresses: false

# Configuration for the gateway
gateway:
  # -- Specifies whether the gateway should be enabled
  enabled: false

networkPolicy:
  # -- Specifies whether Network Policies should be created
  enabled: false

# -------------------------------------
# Configuration for `minio` child chart
# -------------------------------------
minio:
  enabled: false

sidecar:
  rules:
    # -- Whether or not to create a sidecar to ingest rule from specific ConfigMaps and/or Secrets.
    enabled: false
```

- Push logs with Promtail to Loki (stream: `{environment="dev" ...}`, tenant id: kurlypay)
**Expected behavior**
- Log entries in the stream matching `{environment="dev"}` are not marked for deletion even after 1d (the stream's retention period) + 2h (retention_delete_delay).
- Saved log timestamp (UTC+0900): 2024-01-26 16:08:11.189+0900
- Expected log deletion timestamp: 2024-01-27 18:08:11.189+0900 (retention period + retention delete delay)
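For reference, the expected deletion time above is simply the saved timestamp plus the stream's retention period plus `retention_delete_delay` (2h is Loki's documented default); a quick check of the arithmetic:

```python
from datetime import datetime, timedelta, timezone

# Timestamps from the report (KST, UTC+09:00)
kst = timezone(timedelta(hours=9))
saved = datetime(2024, 1, 26, 16, 8, 11, 189000, tzinfo=kst)

retention_period = timedelta(days=1)         # retention_stream period for {environment="dev"}
retention_delete_delay = timedelta(hours=2)  # Loki's default retention_delete_delay

expected_deletion = saved + retention_period + retention_delete_delay
print(expected_deletion.isoformat())  # 2024-01-27T18:08:11.189000+09:00
```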
**Environment:**
- Infrastructure: EKS (Kubernetes 1.24), S3 (using IRSA for object upload/delete/... access)
- Deployment tool: Helm
**Screenshots, Promtail config, or terminal output**
- Loki's k8s ConfigMap (`config.yaml`):

```yaml
analytics:
  reporting_enabled: false
auth_enabled: true
common:
  compactor_address: 'loki-backend'
  path_prefix: /var/loki
  replication_factor: 3
  ring:
    kvstore:
      store: memberlist
  storage:
    hedging:
      at: 250ms
      max_per_second: 20
      up_to: 3
    s3:
      bucketnames: kps-shr-tools-s3-loki
      insecure: false
      region: ap-northeast-2
      s3forcepathstyle: false
      storage_class: STANDARD
compactor:
  compactor_ring:
    kvstore:
      store: memberlist
  retention_enabled: true
  shared_store: s3
  shared_store_key_prefix: compoactor/
  working_directory: /var/loki/compactor
distributor:
  rate_store:
    debug: true
  ring:
    kvstore:
      store: memberlist
  write_failures_logging:
    add_insights_label: true
frontend:
  compress_responses: true
  log_queries_longer_than: 5s
  query_stats_enabled: true
  scheduler_address: query-scheduler-discovery.loki-ns.svc.cluster.local.:9095
  scheduler_dns_lookup_period: 3s
frontend_worker:
  match_max_concurrent: true
  scheduler_address: query-scheduler-discovery.loki-ns.svc.cluster.local.:9095
index_gateway:
  mode: ring
  ring:
    kvstore:
      store: memberlist
ingester:
  lifecycler:
    final_sleep: 15s
    ring:
      kvstore:
        store: memberlist
  wal:
    dir: /var/loki/ingester/wal
    flush_on_shutdown: true
    replay_memory_ceiling: 1GB
limits_config:
  allow_structured_metadata: true
  max_cache_freshness_per_query: 10m
  max_line_size: 10KB
  per_stream_rate_limit: 5MB
  per_stream_rate_limit_burst: 20MB
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 7d
  retention_stream:
    - period: 1d
      priority: 1
      selector: '{environment="dev"}'
    - period: 2d
      priority: 1
      selector: '{environment="stg"}'
  shard_streams:
    enabled: false
  split_queries_by_interval: 15m
memberlist:
  join_members:
    - loki-memberlist
querier:
  max_concurrent: 16
  multi_tenant_queries_enabled: true
  tail_max_duration: 30m
query_range:
  align_queries_with_step: true
  cache_index_stats_results: false
  cache_results: true
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 150
        ttl: 30m
      enable_fifocache: false
    compression: snappy
query_scheduler:
  max_outstanding_requests_per_tenant: 32768
  querier_forget_delay: 60s
ruler:
  storage:
    s3:
      bucketnames: kps-shr-tools-s3-loki
      insecure: false
      region: ap-northeast-2
      s3forcepathstyle: false
    type: s3
runtime_config:
  file: /etc/loki/runtime-config/runtime-config.yaml
schema_config:
  configs:
    - from: "2024-01-01"
      index:
        period: 24h
        prefix: tsdb_index_
      object_store: s3
      schema: v13
      store: tsdb
server:
  grpc_listen_port: 9095
  http_listen_port: 3100
  log_format: logfmt
  log_level: info
  log_request_at_info_level_enabled: true
  log_request_headers: true
  log_source_ips_enabled: true
storage_config:
  aws:
    bucketnames: kps-shr-tools-s3-loki
    insecure: false
    region: ap-northeast-2
    storage_class: STANDARD
  hedging:
    at: 250ms
    max_per_second: 20
    up_to: 3
  tsdb_shipper:
    active_index_directory: /var/loki/ingester/tsdb_shipper
    cache_location: /var/loki/index_gateway/tsdb_shipper
    index_gateway_client:
      log_gateway_requests: true
    shared_store: s3
    shared_store_key_prefix: tsdb_shipper/
tracing:
  enabled: true
```

- Grafana Explore (queried at 2024-01-28 00:21+0900)
- S3
- Backend target log output:

```
level=info ts=2024-01-27T07:01:14.865219404Z caller=compactor.go:517 msg="applying retention with compaction"
level=info ts=2024-01-27T07:01:14.865256731Z caller=expiration.go:78 msg="overall smallest retention period 1706252474.865, default smallest retention period 1706252474.865"
ts=2024-01-27T07:01:14.86529407Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:01:14.946262639Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=80.958298ms
level=info ts=2024-01-27T07:02:14.808135172Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:03:14.807959275Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:04:14.808085233Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:05:14.808426886Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:06:08.655238728Z caller=table_manager.go:228 index-store=tsdb-2024-01-01 msg="syncing tables"
ts=2024-01-27T07:06:08.655315103Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.705822649Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=50.499015ms
ts=2024-01-27T07:06:08.705869075Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.72114278Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.268517ms
ts=2024-01-27T07:06:08.721188867Z caller=spanlogger.go:86 level=info msg="building table cache"
ts=2024-01-27T07:06:08.741034208Z caller=spanlogger.go:86 level=info msg="table cache built" duration=19.839877ms
ts=2024-01-27T07:06:08.74110291Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.75919025Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=18.080712ms
ts=2024-01-27T07:06:08.759236094Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.773159942Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=13.919151ms
ts=2024-01-27T07:06:08.773199339Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.818762004Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=45.556466ms
ts=2024-01-27T07:06:08.818802868Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.8361649Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=17.356727ms
ts=2024-01-27T07:06:08.836208945Z caller=spanlogger.go:86 level=info msg="building table cache"
ts=2024-01-27T07:06:08.850882986Z caller=spanlogger.go:86 level=info msg="table cache built" duration=14.668228ms
ts=2024-01-27T07:06:08.8509299Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.866052312Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.118183ms
ts=2024-01-27T07:06:08.866082521Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.882078327Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.990817ms
ts=2024-01-27T07:06:08.882121167Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.900934931Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=18.809387ms
ts=2024-01-27T07:06:08.900965805Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.917191332Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=16.221127ms
ts=2024-01-27T07:06:08.917222594Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.931074256Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=13.843814ms
ts=2024-01-27T07:06:08.931099495Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:06:08.944555033Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=13.451656ms
level=info ts=2024-01-27T07:06:08.944579799Z caller=table_manager.go:271 index-store=tsdb-2024-01-01 msg="query readiness setup completed" duration=2.952µs distinct_users_len=0 distinct_users=
level=info ts=2024-01-27T07:06:14.808627191Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:07:14.808759172Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:08:14.807985004Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:09:14.80859647Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:10:14.808345602Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:11:08.655754186Z caller=table_manager.go:228 index-store=tsdb-2024-01-01 msg="syncing tables"
ts=2024-01-27T07:11:08.655826266Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.708818756Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=52.983678ms
ts=2024-01-27T07:11:08.708862067Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.725769394Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=16.902595ms
ts=2024-01-27T07:11:08.725807396Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.74124526Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.433144ms
ts=2024-01-27T07:11:08.741285717Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.756958028Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.667322ms
ts=2024-01-27T07:11:08.756997287Z caller=spanlogger.go:86 level=info msg="building table cache"
ts=2024-01-27T07:11:08.773056633Z caller=spanlogger.go:86 level=info msg="table cache built" duration=16.054893ms
ts=2024-01-27T07:11:08.77310801Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.788781492Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.669059ms
ts=2024-01-27T07:11:08.788805885Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.804863018Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=16.05236ms
ts=2024-01-27T07:11:08.804891816Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.820865821Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.969768ms
ts=2024-01-27T07:11:08.82089105Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.836186464Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=15.29177ms
ts=2024-01-27T07:11:08.836213075Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.854480054Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=18.263026ms
ts=2024-01-27T07:11:08.854503319Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.873903565Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=19.395869ms
ts=2024-01-27T07:11:08.873936021Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.890227641Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=16.287615ms
ts=2024-01-27T07:11:08.890249099Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:08.905204725Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=14.951348ms
ts=2024-01-27T07:11:08.905235933Z caller=spanlogger.go:86 level=info msg="building table cache"
ts=2024-01-27T07:11:08.920680952Z caller=spanlogger.go:86 level=info msg="table cache built" duration=15.43968ms
level=info ts=2024-01-27T07:11:08.920720759Z caller=table_manager.go:271 index-store=tsdb-2024-01-01 msg="query readiness setup completed" duration=2.59µs distinct_users_len=0 distinct_users=
level=info ts=2024-01-27T07:11:14.808141385Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:11:14.86529441Z caller=compactor.go:517 msg="applying retention with compaction"
level=info ts=2024-01-27T07:11:14.865334949Z caller=expiration.go:78 msg="overall smallest retention period 1706253074.865, default smallest retention period 1706253074.865"
ts=2024-01-27T07:11:14.865372091Z caller=spanlogger.go:86 level=info msg="building table names cache"
ts=2024-01-27T07:11:14.914375642Z caller=spanlogger.go:86 level=info msg="table names cache built" duration=48.995438ms
level=info ts=2024-01-27T07:12:14.807707634Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:13:14.808146732Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:14:14.808286317Z caller=marker.go:202 msg="no marks file found"
level=info ts=2024-01-27T07:15:14.808442889Z caller=marker.go:202 msg="no marks file found"
```
I tested the values.yaml file below (using boltdb-shipper) and confirmed that compaction and retention work:
```yaml
loki:
  auth_enabled: false
  limits_config:
    retention_period: 1d
  commonConfig:
    replication_factor: 2
  storage:
    bucketNames:
      chunks: kps-shr-tools-s3-loki-test
      ruler: kps-shr-tools-s3-loki-test
    s3:
      region: ap-northeast-2
  storage_config:
    boltdb_shipper:
      active_index_directory: /var/loki/data/index
      cache_location: /var/loki/data/boltdb-cache
      shared_store: s3
  compactor:
    working_directory: /var/loki/data/retention
    shared_store: s3
    retention_delete_delay: 30m
    compaction_interval: 10m
    retention_enabled: true
    retention_delete_worker_count: 150
serviceAccount:
  name: loki-sa
  imagePullSecrets: []
  annotations:
    eks.amazonaws.com/role-arn: (...skip...)
rules:
  enabled: false
  alerting: false
serviceMonitor:
  enabled: false
lokiCanary:
  enabled: false
write:
  replicas: 2
  persistence:
    storageClass: loki-sc
read:
  replicas: 2
  persistence:
    storageClass: loki-sc
backend:
  replicas: 2
  persistence:
    storageClass: loki-sc
gateway:
  enabled: false
extraObjects:
  - apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: loki-sc
    provisioner: efs.csi.aws.com
    parameters:
      provisioningMode: efs-ap
      fileSystemId: (...skip...)
      directoryPerms: "700"
      uid: '{{ .Values.loki.podSecurityContext.runAsUser }}'
      gid: '{{ .Values.loki.podSecurityContext.runAsGroup }}'
```
What did I miss? Please let me know.
Finally, I found the cause. The problem is caused by `-tsdb.shipper.shared-store.key-prefix` and `-compactor.shared-store.key-prefix` being set to different values.
I had simply assumed the compactor uses the `-compactor.shared-store.key-prefix` flag only for deletion, not for compaction and retention. But it doesn't.
I hope this gets added to the official Loki documentation. Since there are separate options for the compactor and the writer, other people may have the same misconception I did.
Hi, can you elaborate on this? Did you set each value separately?
@icanhazbeer That's right. I set the two runtime flags to different values:

- `-tsdb.shipper.shared-store.key-prefix`
- `-compactor.shared-store.key-prefix`
The results of the test are as follows:

- The log entry deletion requests are saved under `-compactor.shared-store.key-prefix` by the compactor.
- The location the compactor reads for compaction and retention is `-compactor.shared-store.key-prefix` (before the test, I thought the compactor referred to `-tsdb.shipper.shared-store.key-prefix`).

As a result, the two flags must always have the same value for the compactor to perform compaction and retention properly.
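To summarize the fix for anyone landing here: give both components the same key prefix. A minimal sketch in Helm values terms (the `index/` prefix below is only an illustrative choice; the key names follow the 5.x chart used above):

```yaml
loki:
  storage_config:
    tsdb_shipper:
      shared_store: s3
      shared_store_key_prefix: index/  # must be identical to the compactor prefix
  compactor:
    shared_store: s3
    shared_store_key_prefix: index/  # same prefix, so the compactor compacts and
                                     # applies retention on the index the shipper uploads
    retention_enabled: true
```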