
Ingester fails with "error searching block" when "replication_factor" is set to 3

Open · jon-rei opened this issue 1 year ago · 1 comment

Describe the bug

Our ingester logs the following errors and warnings:

level=warn ts=2023-12-15T08:49:13.134664609Z caller=server.go:1184 method=/tempopb.Querier/FindTraceByID duration=9.793361ms msg=gRPC err="completeBlock.FindTraceByID failed: error reading row from backend: missing required field: 1:FIELD<I32>"
level=error ts=2023-12-15T08:52:13.142414816Z caller=instance_search.go:144 msg="error searching block" blockID=88645ed5-56d3-4054-a392-7b6aea7a2a34 err="error opening file /bitnami/grafana-tempo/data/wal/88645ed5-56d3-4054-a392-7b6aea7a2a34+single-tenant+vParquet3/0000000029: error opening parquet file: reading magic footer of parquet file: context canceled"

We run the Ingester with 3 replicas and replication_factor: 3. We used to run it with only 2 replicas and replication_factor: 1. If we change our current setup back to replication_factor 1 or even 2, the errors go away.
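For clarity, the only setting we change between the two setups is the ring's replication factor in the ingester block of the config attached below:

ingester:
  lifecycler:
    ring:
      # errors appear with 3, disappear with 1 (or 2)
      replication_factor: 3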

Currently we are a bit confused by these errors. They happen quite frequently, but nothing seems to actually break.

To Reproduce At least reproducible in our environment, across all of our clusters:

  1. Run Ingesters with 3 replicas and replication_factor: 1
  2. Change replication_factor to 3

Expected behavior Running with replication_factor: 3 works without these errors.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: Bitnami Helm chart
  • Tempo v2.3.1

Additional Context We needed to tune the limits quite a lot.

configs

Overrides:

overrides:
  "*":
    max_traces_per_user: 5000000
    ingestion_burst_size_bytes: 500000000
    ingestion_rate_limit_bytes: 500000000
    metrics_generator_processors:
      - service-graphs
      - span-metrics

Config:

usage_report:
  reporting_enabled: false
multitenancy_enabled: false
compactor:
  compaction:
    block_retention: 168h
  ring:
    kvstore:
      store: memberlist
metrics_generator:
  processor:
    service_graphs:
      max_items: 100000
      dimensions:
      - k8s.namespace.name
    span_metrics:
      dimensions:
      - k8s.namespace.name
  ring:
    kvstore:
      store: memberlist
  storage:
    path: {{ .Values.tempo.dataDir }}/wal
    remote_write: {{ include "common.tplvalues.render" (dict "value" .Values.metricsGenerator.remoteWrite "context" $) | nindent 6 }}
distributor:
  ring:
    kvstore:
      store: memberlist
  receivers:
...snip
query_frontend:
  max_outstanding_per_tenant: 3000
  search:
    concurrent_jobs: 2000
querier:
  frontend_worker:
    frontend_address: {{ include "grafana-tempo.query-frontend.fullname" . }}-headless:{{ .Values.queryFrontend.service.ports.grpc }}
    grpc_client_config:
      max_send_msg_size: 500000000
  trace_by_id:
    query_timeout: "60s"
  search:
    query_timeout: "120s"
  max_concurrent_queries: 50
ingester:
  lifecycler:
    ring:
      replication_factor: 3
      kvstore:
        store: memberlist
    tokens_file_path: {{ .Values.tempo.dataDir }}/tokens.json
memberlist:
  abort_if_cluster_join_fails: false
  join_members:
    - {{ include "grafana-tempo.gossip-ring.fullname" . }}
overrides:
  per_tenant_override_config: /bitnami/grafana-tempo/conf/overrides.yaml
server:
  http_server_read_timeout: "60s"
  http_server_write_timeout: "60s"
  http_listen_port: {{ .Values.tempo.containerPorts.web }}
  grpc_server_max_recv_msg_size: 500000000
  grpc_server_max_send_msg_size: 500000000
storage:
  trace:
    backend: s3
    s3:
      endpoint: ${BUCKET_HOST}
      bucket: ${BUCKET_NAME}
      access_key: ${AWS_ACCESS_KEY_ID}
      secret_key: ${AWS_SECRET_ACCESS_KEY}
      forcepathstyle: true
      insecure: true
    local:
      path: {{ .Values.tempo.dataDir }}/traces
    wal:
      path: {{ .Values.tempo.dataDir }}/wal
    cache: memcached
    memcached:
      consistent_hash: true
      host: {{ include "grafana-tempo.memcached.url" . }}
      service: memcache
      timeout: 500ms
    search:
      cache_control:
        footer: true
        column_index: true
        offset_index: true
    block:
      version: vParquet3
    pool:
      max_workers: 100
    cache_min_compaction_level: 1
    cache_max_block_age: 48h

jon-rei avatar Dec 15 '23 09:12 jon-rei

Good catch. This is an innocuous error. With RF3, if the querier gets 2 responses it can be assured that it has received all spans for a trace (due to read/write quorum), so it cancels the final request.

We should log "context canceled" on query at the debug level in the ingester.
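This is not Tempo's actual code, just a minimal Go sketch of the fan-out-and-cancel pattern described above; the names (queryIngester, rf, quorum) are made up for illustration. It shows why the ingester that loses the race sees a "context canceled" error while reading its block.

package main

// Minimal sketch (not Tempo's actual implementation) of quorum-driven
// cancellation: with RF=3 the querier fans a trace-by-ID request out to all
// three ingesters, and as soon as floor(RF/2)+1 = 2 of them answer it cancels
// the shared context. The ingester whose block read is interrupted then
// surfaces "context canceled", which is expected and harmless.

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// queryIngester stands in for a FindTraceByID call to one ingester replica.
func queryIngester(ctx context.Context, id int) (string, error) {
	select {
	case <-time.After(time.Duration(rand.Intn(100)) * time.Millisecond):
		return fmt.Sprintf("spans from ingester %d", id), nil
	case <-ctx.Done():
		// This path produces the "context canceled" log line on the ingester.
		return "", ctx.Err()
	}
}

func main() {
	const rf = 3
	quorum := rf/2 + 1 // 2 when rf == 3

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	results := make(chan string, rf)
	for i := 0; i < rf; i++ {
		go func(id int) {
			if res, err := queryIngester(ctx, id); err == nil {
				results <- res
			} else {
				fmt.Printf("ingester %d: %v\n", id, err)
			}
		}(i)
	}

	// Read/write quorum guarantees that 2 responses already contain every
	// span of the trace, so the remaining in-flight request is cancelled
	// rather than awaited.
	for i := 0; i < quorum; i++ {
		fmt.Println(<-results)
	}
	cancel()

	time.Sleep(200 * time.Millisecond) // let the cancelled goroutine report
}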

joe-elliott avatar Jan 03 '24 16:01 joe-elliott

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity. Please apply the keepalive label to exempt this issue.

github-actions[bot] avatar Mar 04 '24 00:03 github-actions[bot]