
"Query does not fit in a single sharding configuration" after changing schema to TSDB

Open nektarios-d opened this issue 8 months ago • 7 comments

Describe the bug After changing the schema config from boltdb-shipper to TSDB, we get the following errors in the logs when we run a query:

ts=2023-10-03T07:45:36.818777014Z caller=spanlogger.go:86 user=fake level=error msg="failed to get schema config, not applying querySizeLimit" err="Query does not fit in a single sharding configuration"

ts=2023-10-03T07:45:36.818998318Z caller=spanlogger.go:86 middleware=QueryShard.astMapperware org_id=fake traceID=4a4b53d9c04c37e4 org_id=fake traceID=4a4b53d9c04c37e4 level=warn err="Query does not fit in a single sharding configuration" msg="skipped AST mapper for request"

To Reproduce Steps to reproduce the behavior:

  1. Add config for TSDB according to https://grafana.com/docs/loki/latest/operations/storage/tsdb/
  2. Restart loki container
  3. Once the from date in the config has passed, issue a query to Loki
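Step 3 can be sketched as a small script that builds a request URL for Loki's HTTP query_range endpoint. This is only an illustration under assumptions: the base URL http://localhost:3100, the label selector {job="app"}, and the helper name build_query_range_url are hypothetical and not taken from the report. A later comment in this thread suggests the message shows up when the queried range spans the schema change, so the usage lines pick a range that straddles the from date.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_query_range_url(base_url, logql, start, end, limit=100):
    """Build a GET URL for the /loki/api/v1/query_range endpoint.

    start and end are timezone-aware datetimes; they are sent as
    nanosecond Unix epoch values (whole seconds only, to avoid float
    rounding).
    """
    params = {
        "query": logql,
        "start": int(start.timestamp()) * 1_000_000_000,
        "end": int(end.timestamp()) * 1_000_000_000,
        "limit": limit,
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


# The from date of the new TSDB period in the config above.
boundary = datetime(2023, 10, 3, tzinfo=timezone.utc)

# Hypothetical host and selector; the range covers one hour on each
# side of the schema change.
url = build_query_range_url(
    "http://localhost:3100",
    '{job="app"}',
    boundary - timedelta(hours=1),
    boundary + timedelta(hours=1),
)
print(url)
```

The printed URL can be passed to curl or any HTTP client; the script itself does not contact Loki.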

Expected behavior Loki to run without errors

Environment:

  • Infrastructure: docker
  • Loki version 2.9.1

Relevant Config:

limits_config:
  split_queries_by_interval: 10h
  ingestion_rate_mb: 128
  ingestion_burst_size_mb: 256
  max_streams_per_user: 0
  max_global_streams_per_user: 0
  per_stream_rate_limit: 80MB
  per_stream_rate_limit_burst: 100MB
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_query_series: 100000
  max_query_parallelism: 6
  tsdb_max_query_parallelism: 512 # default

query_scheduler:
  max_outstanding_requests_per_tenant: 32768

querier:
  query_ingesters_within: 7h
  max_concurrent: 16

schema_config:
  configs:
    - from: 2023-01-01
      store: boltdb-shipper
      object_store: aws
      schema: v11
      index:
        prefix: loki_index_
        period: 24h
    - from: "2023-10-03" # <---- A date in the future. The date we switch to TSDB from UTC 00:00:0
      index:
        period: 24h
        prefix: loki_index_
      object_store: aws
      schema: v12 # Current recommended schema version
      store: tsdb

nektarios-d avatar Oct 03 '23 08:10 nektarios-d

I think I can add some comments on this.

I updated from version 2.8 to 2.9.1 to use the multi-store index support.

Previously I was already using TSDB Shipper, and this error did not occur.

So I believe the problem is not tied to the switch to TSDB itself, but rather to the update to version 2.9.


https://github.com/grafana/loki/blob/3038170aee8d26cc7d5cad09fba5e77bf5e2231b/pkg/querier/queryrange/limits.go#L358

marcusteixeira avatar Oct 03 '23 14:10 marcusteixeira

@nektarios-d

How are your configurations related to storage_config?

shared_store assignments?

marcusteixeira avatar Oct 03 '23 14:10 marcusteixeira

@nektarios-d

How are your configurations related to storage_config?

shared_store assignments?

@marcusteixeira

storage_config:
  aws:
    s3: "s3://<redacted>:<redacted>@eu-west-2/<bucket_name>"
  boltdb_shipper:
    active_index_directory: /data/loki/index
    shared_store: s3
    cache_ttl: 24h
    cache_location: /data/loki/boltdb-cache
  tsdb_shipper:
    active_index_directory: /data/loki/tsdb-index
    cache_location: /data/loki/tsdb-cache
    cache_ttl: 24h                           # default
    shared_store: s3

nektarios-d avatar Oct 03 '23 14:10 nektarios-d

I'm also facing the same issue after migrating to TSDB. I'm using Loki 2.8.2.

kadhamecha-conga avatar Dec 14 '23 02:12 kadhamecha-conga

I am also seeing the reported error message with Loki 2.9.3 and a changed schema.

Apart from this error message, the query results seem to be fine.

The code and the description of pull request #9050, which introduced this message, help to explain it. The message can appear, for example, when a query's time range spans two different schema periods. When the time range lies entirely within either the old or the new schema period, no error is logged.
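To make the scenario concrete, here is a minimal sketch of checking whether a time range crosses a schema from date, and of cutting it at that date. It is not Loki's implementation, and the actual check in the linked code may behave differently, especially at the exact boundary. The names PERIOD_STARTS, boundaries_inside, fits_single_period and split_at_boundaries are hypothetical; the dates mirror the schema_config posted above.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the from dates in schema_config.
PERIOD_STARTS = [
    datetime(2023, 1, 1, tzinfo=timezone.utc),   # boltdb-shipper, v11
    datetime(2023, 10, 3, tzinfo=timezone.utc),  # tsdb, v12
]


def boundaries_inside(start, end, starts=PERIOD_STARTS):
    """Return the period start dates that fall strictly inside (start, end)."""
    return sorted(s for s in starts if start < s < end)


def fits_single_period(start, end, starts=PERIOD_STARTS):
    """True if the half-open range [start, end) crosses no period boundary."""
    return not boundaries_inside(start, end, starts)


def split_at_boundaries(start, end, starts=PERIOD_STARTS):
    """Cut [start, end) into sub-ranges that each stay within one period."""
    edges = [start, *boundaries_inside(start, end, starts), end]
    return list(zip(edges, edges[1:]))


# A query from 20:00 on Oct 2 to 04:00 on Oct 3 spans both periods.
query_start = datetime(2023, 10, 2, 20, tzinfo=timezone.utc)
query_end = datetime(2023, 10, 3, 4, tzinfo=timezone.utc)

print(fits_single_period(query_start, query_end))
for sub_start, sub_end in split_at_boundaries(query_start, query_end):
    print(sub_start.isoformat(), "->", sub_end.isoformat())
```

Each sub-range returned by split_at_boundaries stays on one side of the schema change, which corresponds to the case where no error was observed.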

@trevorwhitney can you confirm that this is not an issue in the described scenario?

darxriggs avatar Dec 14 '23 13:12 darxriggs

We saw the same error today (should it be a warning?) after changing from v12 to v13 in preparation for Loki 3.0:

- from: 2024-04-25
  store: tsdb
  object_store: gcs
  schema: v13
  index:
    prefix: loki_index_
    period: 24h

lindeskar avatar Apr 25 '24 11:04 lindeskar