
failed to get token ranges for ingester err="zone not set"

Open · Assaf1928 opened this issue 1 year ago • 10 comments

After the last version update (3.1) I get the following error:

level=error ts=2024-07-04T12:19:47.79081202Z caller=recalculate_owned_streams.go:55 msg="failed to get token ranges for ingester" err="zone not set"

This is my config.yml file:

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100



limits_config:
  retention_period: 720h

schema_config:
  configs:
    - from: 2021-08-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
    
# # chunk_store_config:
# #   max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

ruler:
  alertmanager_url: http://localhost:9093

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    join_after: 30s
    final_sleep: 0s
  wal:
    enabled: true
    dir: /loki/wal

distributor:
  ring:
    kvstore:
      store: inmemory


compactor:
  working_directory: /loki/compactor
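
For reference, a hedged sketch of where an availability zone could be declared in a config shaped like the one above. The zone name is illustrative, and a comment further down in this thread reports that setting it alone only changes the error rather than fixing it:

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  replication_factor: 1
  ring:
    # illustrative zone name; the "zone not set" error indicates no zone is configured by default
    instance_availability_zone: zone-a
    kvstore:
      store: inmemory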

Assaf1928 · Jul 04 '24 13:07

Had the same issue after a container update yesterday.

I checked my loki.yaml file and tried again today, and the image had been updated. The new version does not have the issue, and my Loki instance is now back up again.

I started up my promtail instances and received the following log entries:

promtail-1  | level=warn ts=2024-07-04T00:17:29.947533216Z caller=client.go:419 component=client host=192.168.1.93:3100 msg="error sending batch, will retry" status=-1 tenant=id-edge1 error="Post \"http://192.168.1.93:3100/loki/api/v1/push\": dial tcp 192.168.1.93:3100: connect: connection refused"

promtail-2  | level=warn ts=2024-07-04T13:53:36.198284808Z caller=promtail.go:263 msg="enable watchConfig"

promtail-3  | level=warn ts=2024-07-04T20:46:55.580209956Z caller=client.go:419 component=client host=192.168.1.93:3100 msg="error sending batch, will retry" status=500 tenant=id-services error="server returned HTTP status 500 Internal Server Error (500): empty ring"

promtail-4  | level=warn ts=2024-07-04T13:53:07.153379447Z caller=promtail.go:263 msg="enable watchConfig"
promtail-4  | level=warn ts=2024-07-04T20:46:55.565327215Z caller=client.go:419 component=client host=192.168.1.93:3100 msg="error sending batch, will retry" status=500 tenant=id-security error="server returned HTTP status 500 Internal Server Error (500): empty ring"

promtail-5  | level=warn ts=2024-07-04T00:18:31.464704864Z caller=promtail.go:263 msg="enable watchConfig"
promtail-5  | level=warn ts=2024-07-04T20:47:01.30170839Z caller=client.go:419 component=client host=192.168.1.93:3100 msg="error sending batch, will retry" status=-1 tenant=id-media error="Post \"http://192.168.1.93:3100/loki/api/v1/push\": dial tcp 192.168.1.93:3100: connect: connection refused"

Looks like I am still having issues with the inmemory ring key value store. Loki logs show no issues:

loki  | level=warn ts=2024-07-04T20:48:07.297659369Z caller=loki.go:288 msg="global timeout not configured, using default engine timeout (\"5m0s\"). This behavior will change in the next major to always use the default global timeout (\"5m\")."
loki  | level=warn ts=2024-07-04T20:48:07.312935977Z caller=cache.go:127 msg="fifocache config is deprecated. use embedded-cache instead"
loki  | level=warn ts=2024-07-04T20:48:07.312978812Z caller=experimental.go:20 msg="experimental feature in use" feature="In-memory (FIFO) cache - chunksembedded-cache"

Accessing the Loki metrics page displays current metrics, so the instance is running.

instantdreams · Jul 04 '24 20:07

A new install of 3.1 gives me the same error:

caller=recalculate_owned_streams.go:55 msg="failed to get token ranges for ingester" err="zone not set"

crazyelectron-io · Jul 23 '24 08:07

I had the same issue with Loki 3.1.0 and the following configuration file:

---

auth_enabled: false

server:
  http_listen_port: 3100
  log_level: info

common:
  replication_factor: 1
  path_prefix: /loki
    
  storage:
    filesystem:
      chunks_directory: /loki/chunks

  ring:
    # instance_availability_zone: zone
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2024-05-28
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

compactor:
  retention_enabled: true
  retention_delete_delay: 2h
  delete_request_cancel_period: 24h
  delete_request_store: s3

ruler:
  enable_api: true
  enable_alertmanager_v2: true

  alertmanager_url: http://alertmanager.monitoring.svc:9093

  remote_write:
    enabled: true
    client:
      url: http://prometheus.monitoring.svc:9090/api/v1/write

  storage:
    type: s3
    s3:
      endpoint: minio.cloud.svc:9000
      s3forcepathstyle: true
      bucketnames: loki
      insecure: true
      access_key_id: ${S3_USER}
      secret_access_key: ${S3_PASS}
   
  wal:
    dir: /loki/ruler_wal

limits_config:
  retention_period: 720h
  max_query_length: 721h
  max_entries_limit_per_query: 1000000
  ingestion_rate_mb: 8 # MB, default is 4
  ingestion_burst_size_mb: 12 # MB, default is 6
  split_queries_by_interval: 15m
  unordered_writes: true
  retention_stream: []
  shard_streams:
    enabled: false
  volume_enabled: true
  discover_service_name: [ topic ]
  discover_log_levels: true

query_range:
  align_queries_with_step: true
  cache_results: true

query_scheduler:
  max_outstanding_requests_per_tenant: 2048

frontend:
  max_outstanding_per_tenant: 2048

analytics:
  reporting_enabled: false

ingester:
  max_chunk_age: 8h
  chunk_idle_period: 4h

pattern_ingester:
  enabled: true

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
  aws:
    s3: s3://${S3_USER}:${S3_PASS}@minio.cloud.svc:9000/loki
    s3forcepathstyle: true

However, when I uncomment the instance_availability_zone: zone line, the error changes to "can't use ring configuration for computing token ranges", which is defined in https://github.com/joe-elliott/dskit/blob/0e1c99b54ea7ef89ad80fa32cb2751cc0dbf5c32/ring/token_range.go#L77C26-L77C84

ondrejmo · Jul 25 '24 15:07

I ran into this yesterday and gave up looking and went back to 3.0.0. I was using a fresh install of the latest Helm chart.

KAllan357 · Jul 25 '24 20:07

I'm hitting this with all 3.1 configurations I can think to try. It seems #13103 introduced recalculate_owned_streams.go. At a glance, it is invoked no matter what the configuration is (aside from, presumably, single binary), and it doesn't properly pick up the replication factor and/or zone-awareness fields.

Rolled back to 3.0.0 to get things online, but would love to be able to upgrade. :-)

cboggs · Aug 05 '24 13:08

I see the same error on 3.1.0, but the whole Loki stack is working as expected. What is the cause of this error?

Hitesh-Agrawal · Aug 08 '24 06:08

I'm having the same problem. Has anyone solved it? Chart version: 6.10.0, chart name: loki, Loki version: 3.1.1.

zeyneprumeysayorulmaz · Aug 19 '24 09:08

Same error here with the latest version of the official Loki chart.

Starefossen · Sep 10 '24 11:09

Experiencing the same error without zone-aware replication of ingesters, on chart version 6.12.0 (chart name: loki):

caller=recalculate_owned_streams.go:55 msg="failed to get token ranges for ingester" err="can't use ring configuration for computing token ranges"

Downgraded for now. Are there any updates on this?

kristeey · Sep 20 '24 07:09

Same here, running single binary with filesystem storage and replication=1, so IIUC there are no zones in this configuration.

pikeas · Oct 18 '24 23:10

I'm running the distributed-loki Helm chart with image tag 3.2.1, and it seems setting zone_awareness_enabled: false in the common.ring configuration gets rid of the error in the ingester logs. However, Grafana still displays "An error occurred within the plugin" when I browse to the logs explorer, and it logs client: failed to call resources: empty ring.

If pattern_ingester is disabled then I have no errors.
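
Putting those two observations together, here is a rough sketch of the workaround described above, expressed as plain Loki configuration rather than Helm values (a sketch only; whether this is a safe long-term setting is not confirmed in this thread):

common:
  ring:
    # turning off zone-aware replication removes the token-range error from the ingester logs
    zone_awareness_enabled: false

pattern_ingester:
  # with the pattern ingester disabled, this commenter reports no errors at all
  enabled: false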

lesaux · Oct 25 '24 16:10

Also experiencing this error in version 3.1.1:

msg="failed to get token ranges for ingester" err="zone not set"

Could this be the reason the ingester is getting stuck in the ring in an Unhealthy state?

https://github.com/grafana/loki/issues/14847

alexinfoblox · Nov 08 '24 13:11

I'm getting the same error with version 3.0.0. Is that normal?

Praveenk8051 · Nov 13 '24 13:11

I'm getting the same error with the latest version:

level=error ts=2025-01-10T07:52:05.242016583Z caller=recalculate_owned_streams.go:55 msg="failed to get token ranges for ingester" err="zone not set"

With version 3.0.0 this error doesn't appear.

pbozzoli · Jan 10 '25 08:01

Has anyone solved it? I'm getting the same error on :latest.

mxlb-dev · Jan 25 '25 16:01

This doesn't occur with version 2.9.

Praveenk8051 · Jan 26 '25 07:01