
server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec)

Open dmitry-mightydevops opened this issue 3 years ago • 3 comments

I used the Loki chart https://github.com/grafana/helm-charts/blob/main/charts/loki/values.yaml

and get the following in my promtail pods:

level=warn ts=2021-08-09T01:02:45.604013738Z caller=client.go:344 component=client host=loki.monitoring:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec) while attempting to ingest '4963' lines totaling '705285' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
level=warn ts=2021-08-09T01:02:46.235952629Z caller=client.go:344 component=client host=loki.monitoring:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec) while attempting to ingest '4963' lines totaling '705285' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
level=warn ts=2021-08-09T01:02:47.618849145Z caller=client.go:344 component=client host=loki.monitoring:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec) while attempting to ingest '4963' lines totaling '705285' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
level=warn ts=2021-08-09T01:02:51.138024849Z caller=client.go:344 component=client host=loki.monitoring:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec) while attempting to ingest '4963' lines totaling '705285' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
level=warn ts=2021-08-09T01:02:56.281304488Z caller=client.go:344 component=client host=loki.monitoring:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 4194304 bytes/sec) while attempting to ingest '4963' lines totaling '705285' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
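
For reference, 4194304 bytes/sec in that message is 4 * 1024 * 1024, i.e. 4 MiB/s. If I read the Loki docs right, that is the stock limit that applies when no override is picked up, roughly this (values are my assumption from the docs, not taken from my cluster):

limits_config:
  ingestion_rate_mb: 4        # 4 MiB/s = 4194304 bytes/sec, the limit quoted in the 429
  ingestion_burst_size_mb: 6  # assumed default burst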

values.promtail.yaml

resources:
  limits:
    cpu: 200m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

config:
  lokiAddress: http://loki.monitoring:3100/loki/api/v1/push
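
The retries in the warnings above come from promtail's client backoff. For completeness, that behaviour can be tuned in the raw promtail client config; this is only a sketch (field names from the promtail docs, values illustrative), not something I have in my chart values:

clients:
  - url: http://loki.monitoring:3100/loki/api/v1/push
    backoff_config:
      min_period: 500ms   # wait before the first retry
      max_period: 5m      # cap on the exponential backoff
      max_retries: 10     # a batch is dropped after this many attempts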

values.loki.yaml

nodeSelector:
  ops: "true"

rbac:
  create: true
  pspEnabled: true

config:
  limits_config:
    enforce_metric_name: false
    reject_old_samples: true
    reject_old_samples_max_age: 168h
    ingestion_rate_mb: 10
    ingestion_burst_size_mb: 20

So adding the following two lines to limits_config had no effect:

ingestion_rate_mb: 10
ingestion_burst_size_mb: 20

I'm using the latest charts:

✗  helm search repo loki                            
NAME                    	CHART VERSION	APP VERSION	DESCRIPTION                                       
grafana/loki            	2.6.0        	v2.3.0     	Loki: like Prometheus, but for logs.              
grafana/loki-canary     	0.4.0        	2.3.0      	Helm chart for Grafana Loki Canary                
grafana/loki-distributed	0.36.0       	2.3.0      	Helm chart for Grafana Loki in microservices mode 
grafana/loki-stack      	2.4.1        	v2.1.0     	Loki: like Prometheus, but for logs.              
loki/loki               	2.1.1        	v2.0.0     	DEPRECATED Loki: like Prometheus, but for logs.   
loki/loki-stack         	2.1.2        	v2.0.0     	DEPRECATED Loki: like Prometheus, but for logs.   
loki/fluent-bit         	2.0.2        	v2.0.0     	DEPRECATED Uses fluent-bit Loki go plugin for g...
loki/promtail           	2.0.2        	v2.0.0     	DEPRECATED Responsible for gathering logs and s...
grafana/fluent-bit      	2.3.0        	v2.1.0     	Uses fluent-bit Loki go plugin for gathering lo...
grafana/promtail        	3.7.0        	2.3.0      	Promtail is an agent which ships the contents o...

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm show values grafana/promtail

helm upgrade --install promtail grafana/promtail \
    --create-namespace \
    --namespace monitoring \
    --values cluster/production/charts/loki/values.promtail.yaml 

helm upgrade --install loki grafana/loki \
    --create-namespace \
    --namespace monitoring \
    --values cluster/production/charts/loki/values.loki.yaml 

dmitry-mightydevops avatar Aug 09 '21 02:08 dmitry-mightydevops

I even went onto the node running Loki and into the Loki container, and this is the config:

PID   USER     TIME  COMMAND
    1 loki      0:03 /usr/bin/loki -config.file=/etc/loki/loki.yaml
   46 loki      0:00 ash
   60 loki      0:00 ps aufx
/ $ cat /etc/loki/loki.yaml 
auth_enabled: false
chunk_store_config:
  max_look_back_period: 0s
compactor:
  shared_store: filesystem
  working_directory: /data/loki/boltdb-shipper-compactor
ingester:
  chunk_block_size: 262144
  chunk_idle_period: 3m
  chunk_retain_period: 1m
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  max_transfer_retries: 0
limits_config:
  enforce_metric_name: false
  ingestion_burst_size_mb: 20
  ingestion_rate_mb: 10
  reject_old_samples: true
  reject_old_samples_max_age: 168h
schema_config:
  configs:
  - from: "2020-10-24"
    index:
      period: 24h
      prefix: index_
    object_store: filesystem
    schema: v11
    store: boltdb-shipper
server:
  http_listen_port: 3100
storage_config:
  boltdb_shipper:
    active_index_directory: /data/loki/boltdb-shipper-active
    cache_location: /data/loki/boltdb-shipper-cache
    cache_ttl: 24h
    shared_store: filesystem
  filesystem:
    directory: /data/loki/chunks
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s

So these values were applied (from the Helm chart), but I still get the original error I reported. I saw this discussion https://community.grafana.com/t/discarding-promtail-log-entries-en-masse/41128, but apparently I have missed something else.

dmitry-mightydevops avatar Aug 09 '21 02:08 dmitry-mightydevops

@dmitry-mightydevops try the per_stream_rate_limit setting: https://grafana.com/docs/loki/latest/configuration/#limits_config
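
A minimal sketch of where that setting would go, with purely illustrative values:

limits_config:
  per_stream_rate_limit: 5MB        # rate limit applied per log stream
  per_stream_rate_limit_burst: 20MB # burst allowance per stream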

alexandre1984rj avatar Apr 29 '22 15:04 alexandre1984rj

Here is my experimental config, which seemed to help get rid of the 429 errors. It might come in handy.

      retention_period: 72h
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h
      max_cache_freshness_per_query: 10m
      split_queries_by_interval: 15m
      # for big logs tune
      per_stream_rate_limit: 512M
      per_stream_rate_limit_burst: 1024M
      cardinality_limit: 200000
      ingestion_burst_size_mb: 1000
      ingestion_rate_mb: 10000
      max_entries_limit_per_query: 1000000
      max_label_value_length: 20480
      max_label_name_length: 10240
      max_label_names_per_series: 300

rmn-lux avatar Sep 28 '22 12:09 rmn-lux

@dmitry-mightydevops

I ran into the same issue. Have you fixed it? Could you share your config? Thanks.

zzswang avatar Feb 26 '23 05:02 zzswang