Publish() should back off if Elasticsearch returns HTTP 429 rate-limiting responses
Describe the enhancement:
Under high load, Elasticsearch can return HTTP 429 responses to indicate rate limiting. Beats should back off when it detects an HTTP 429 response.
Looking at the code here, it does not appear to do that and just re-sends: https://github.com/elastic/beats/blob/main/libbeat/outputs/elasticsearch/client.go#L187
Describe a specific use case for the enhancement or feature:
I have an Elasticsearch cluster, and in my network I have deployed 1500 elastic-agents. They are sending lots of logs to Elasticsearch, and I am routinely getting HTTP 429 errors.
On one hand, I am trying to scale the resources on the Elasticsearch server side.
However, it would be good if beats and elastic-agent could back off when they detect HTTP 429 errors. Right now beats and elastic-agent seem to keep hammering Elasticsearch even when it returns HTTP 429 errors.
- Enhancement Request 19985 was created for this in Elastic Support case 01503357
The docs for backoff.init and backoff.max mention "network errors". I don't know if an HTTP 429 qualifies as a "network error", because really it is not a network error. For HTTP 429, the client was able to successfully make a network connection to the server, but the server decided to send back an HTTP 429 response.
If beats and elastic-agent could back off in response to HTTP 429, that would be helpful, and I could tune the backoff parameters in elastic-agent policies.
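To make the request concrete, here is a minimal Go sketch of the behavior being asked for: an exponentially growing wait that is applied when the server answers 429 and reset on any other status. The `backoff` type, `newBackoff`, and `waitFor` are hypothetical names for illustration only and are not the libbeat implementation; the `init` and `max` fields are merely modeled on the `backoff.init` and `backoff.max` settings mentioned above, and the durations are arbitrary.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// backoff tracks an exponentially growing wait bounded by init and max.
// Hypothetical type for illustration; not taken from libbeat.
type backoff struct {
	init, max, current time.Duration
}

func newBackoff(init, max time.Duration) *backoff {
	return &backoff{init: init, max: max, current: init}
}

// next returns the wait to apply now and doubles the following wait,
// capping it at max.
func (b *backoff) next() time.Duration {
	d := b.current
	b.current *= 2
	if b.current > b.max {
		b.current = b.max
	}
	return d
}

// reset returns to the initial wait.
func (b *backoff) reset() { b.current = b.init }

// waitFor decides how long to pause after a response with the given
// HTTP status: 429 triggers the backoff, anything else resets it.
func waitFor(b *backoff, status int) time.Duration {
	if status == http.StatusTooManyRequests {
		return b.next()
	}
	b.reset()
	return 0
}

func main() {
	b := newBackoff(1*time.Second, 60*time.Second)
	for _, status := range []int{429, 429, 429, 200, 429} {
		fmt.Printf("status=%d wait=%s\n", status, waitFor(b, status))
	}
}
```

The point of the sketch is that a 429 is treated like any other retryable failure for pacing purposes, even though the connection itself succeeded.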
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
This behavior is fairly easy to reproduce.
filebeat.yml
---
filebeat:
  inputs:
    - type: benchmark
      id: my-benchmark-id
      enabled: true
      count: 100
output:
  elasticsearch:
    hosts:
      - "http://localhost:9200"
mock-es
./mock-es -toomany 100 -metrics 5s
example of filebeat logs
{"log.level":"debug","@timestamp":"2025-06-25T09:44:16.681-0500","log.logger":"elasticsearch.elasticsearch","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/outputs/elasticsearch.(*Client).bulkCollectPublishFails","file.name":"elasticsearch/client.go","file.line":465},"message":"Bulk item insert failed (i=1, status=429): ","service.name":"filebeat","ecs.version":"1.6.0"}
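The log line above reports a 429 on an individual bulk item, not on the HTTP request as a whole. As a rough sketch of how such per-item statuses can be detected, the Go snippet below counts the items with status 429 in a `_bulk` response body. The `bulkResponse` and `bulkItemResult` types and the `countTooMany` function are hypothetical names for illustration, not the code in `client.go`, and the sample body is hand-written and trimmed to the `errors` and `items[].<action>.status` fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkResponse models only the fields of a _bulk response needed here.
// Hypothetical type for illustration; not taken from libbeat.
type bulkResponse struct {
	Errors bool                        `json:"errors"`
	Items  []map[string]bulkItemResult `json:"items"`
}

// bulkItemResult holds the per-item HTTP status.
type bulkItemResult struct {
	Status int `json:"status"`
}

// countTooMany returns how many items in a _bulk response body
// were rejected with status 429.
func countTooMany(body []byte) (int, error) {
	var resp bulkResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, err
	}
	n := 0
	for _, item := range resp.Items {
		// Each item has a single key naming the action, e.g. "create".
		for _, result := range item {
			if result.Status == 429 {
				n++
			}
		}
	}
	return n, nil
}

func main() {
	// Hand-written sample body, not captured from a real cluster.
	body := []byte(`{"errors":true,"items":[` +
		`{"create":{"status":429}},` +
		`{"create":{"status":201}},` +
		`{"create":{"status":429}}]}`)
	n, err := countTooMany(body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d of 3 items were rejected with 429\n", n)
}
```

A non-zero count is the signal that the next attempt could be delayed instead of being sent immediately.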
mock-es metrics
{"bulk.create.too_many":{"count":255300},"bulk.create.total":{"count":2553},"license.total":{"count":1},"root.total":{"count":2}}
{"bulk.create.too_many":{"count":635700},"bulk.create.total":{"count":6357},"license.total":{"count":1},"root.total":{"count":2}}
{"bulk.create.too_many":{"count":1019500},"bulk.create.total":{"count":10195},"license.total":{"count":1},"root.total":{"count":2}}
{"bulk.create.too_many":{"count":1402300},"bulk.create.total":{"count":14023},"license.total":{"count":1},"root.total":{"count":2}}
{"bulk.create.too_many":{"count":1777900},"bulk.create.total":{"count":17779},"license.total":{"count":1},"root.total":{"count":2}}
Candidate fix should be up soon. New mock-es metrics:
{"bulk.create.too_many":{"count":200},"bulk.create.total":{"count":2},"license.total":{"count":2},"root.total":{"count":4}}
{"bulk.create.too_many":{"count":300},"bulk.create.total":{"count":3},"license.total":{"count":3},"root.total":{"count":6}}
{"bulk.create.too_many":{"count":400},"bulk.create.total":{"count":4},"license.total":{"count":4},"root.total":{"count":8}}
{"bulk.create.too_many":{"count":400},"bulk.create.total":{"count":4},"license.total":{"count":4},"root.total":{"count":8}}
{"bulk.create.too_many":{"count":500},"bulk.create.total":{"count":5},"license.total":{"count":5},"root.total":{"count":10}}
{"bulk.create.too_many":{"count":500},"bulk.create.total":{"count":5},"license.total":{"count":5},"root.total":{"count":10}}
{"bulk.create.too_many":{"count":500},"bulk.create.total":{"count":5},"license.total":{"count":5},"root.total":{"count":10}}
{"bulk.create.too_many":{"count":500},"bulk.create.total":{"count":5},"license.total":{"count":5},"root.total":{"count":10}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
{"bulk.create.too_many":{"count":600},"bulk.create.total":{"count":6},"license.total":{"count":6},"root.total":{"count":12}}
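To compare the two runs, the counters can be turned into bulk requests per second, given that mock-es was started with `-metrics 5s`. The Go sketch below does this for the first three lines of each run, abbreviated to the two `bulk.create.*` counters. The `snapshot` type and `requestRates` function are hypothetical helpers written for this comparison, and the 5-second spacing between lines is assumed from the `-metrics` flag. In both runs `bulk.create.too_many` is exactly 100 times `bulk.create.total`, so every bulk request had all 100 of its items rejected.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type counter struct {
	Count int `json:"count"`
}

// snapshot holds the two bulk counters from one mock-es metrics line.
type snapshot struct {
	TooMany counter `json:"bulk.create.too_many"`
	Total   counter `json:"bulk.create.total"`
}

// requestRates returns bulk requests per second between consecutive
// metrics lines, assuming they are intervalSeconds apart.
func requestRates(lines []string, intervalSeconds float64) ([]float64, error) {
	var rates []float64
	var prev snapshot
	for i, line := range lines {
		var cur snapshot
		if err := json.Unmarshal([]byte(line), &cur); err != nil {
			return nil, err
		}
		if i > 0 {
			delta := cur.Total.Count - prev.Total.Count
			rates = append(rates, float64(delta)/intervalSeconds)
		}
		prev = cur
	}
	return rates, nil
}

func main() {
	// Counter values copied from the metrics above; other fields omitted.
	before := []string{
		`{"bulk.create.too_many":{"count":255300},"bulk.create.total":{"count":2553}}`,
		`{"bulk.create.too_many":{"count":635700},"bulk.create.total":{"count":6357}}`,
		`{"bulk.create.too_many":{"count":1019500},"bulk.create.total":{"count":10195}}`,
	}
	after := []string{
		`{"bulk.create.too_many":{"count":200},"bulk.create.total":{"count":2}}`,
		`{"bulk.create.too_many":{"count":300},"bulk.create.total":{"count":3}}`,
		`{"bulk.create.too_many":{"count":400},"bulk.create.total":{"count":4}}`,
	}
	for name, lines := range map[string][]string{"before": before, "after": after} {
		rates, err := requestRates(lines, 5)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %.1f requests/s\n", name, rates)
	}
}
```

On these lines the rate drops from roughly 760 bulk requests per second before the candidate fix to about 0.2 per second after it, and the later lines show the gaps between requests growing further.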