Fluent-bit Connection Retry Issue
Issue Title
Fluent-bit continuously retries connecting to Elasticsearch with occasional connection reset errors
Issue Description
When Fluent-bit is deployed via the Helm chart with the configuration below, logs are forwarded to Elasticsearch periodically. However, while flushing chunks to Elasticsearch, Fluent-bit hits connection errors, specifically "Connection reset by peer" and "Broken pipe." Sometimes it reconnects successfully; at other times the periodic retries keep failing, leaving it in a constant retry state with no logs being forwarded.
Fluent bit Configuration
serviceMonitor:
  enabled: true
config:
  customParsers: |
    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
    [PARSER]
        Name multiline
        Format regex
        Regex /(?
Expected Behavior
Fluent-bit should establish a stable connection to Elasticsearch without frequent connection reset or broken pipe errors. Log forwarding should occur consistently.
Steps to Reproduce
- Deploy Fluent-bit using the provided configuration via the Helm chart.
- Observe the Fluent-bit logs and Elasticsearch connection status.
- Monitor the intermittent connection errors and constant retrying behavior.
Additional Information
- Fluent-bit version: latest Fluent Bit Helm chart
- Elasticsearch version: v7.17 (managed Elasticsearch)
The following log snippets show what I am seeing:
When it retries connection and is successful:
When it retries connection and is unsuccessful:
Any assistance with this issue would be highly appreciated, as I need to address it before deploying to production.
Hi @ShahzaibAhmedKhan31, which Fluent Bit version are you using? (Note that I am noticing that the remote endpoint is closing the connection.)
Are you on the latest v2.2.x?
Hello @edsiper, thank you for your response. I am using Fluent Bit version 2.2.0. During the time when Fluent Bit was retrying its connection, I tried to insert data through Postman using the Elasticsearch API and was successful in doing so. Therefore, I believe Elasticsearch was up and running normally at that time. (Note that I have installed Fluent Bit through the Helm chart)
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.
@ShahzaibAhmedKhan31 have you resolved it? I have the same issue right now.
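One thing that may be worth checking, given the note above that the remote endpoint is closing the connection: a managed Elasticsearch endpoint (or a load balancer in front of it) may drop idle keepalive connections before Fluent Bit stops reusing them, which can surface as "Connection reset by peer" or "Broken pipe" on the next flush. This is a hypothesis, not a confirmed root cause for this issue. The fragment below is a minimal sketch of Helm values that shortens the keepalive idle timeout on the es output. The Host, Port, tls setting, timeout value, and retry limit are all placeholders and assumptions, not values taken from the original report.

```yaml
config:
  outputs: |
    [OUTPUT]
        Name  es
        Match *
        # Placeholder endpoint; replace with your managed Elasticsearch host and port
        Host  elasticsearch.example.com
        Port  9200
        # Assumption: the managed endpoint is served over HTTPS
        tls   On
        # Recycle idle keepalive connections early, before the remote side closes them
        # (10 seconds is an illustrative value; tune it below the remote idle timeout)
        net.keepalive               On
        net.keepalive_idle_timeout  10
        # To test the hypothesis quickly, connection reuse can be disabled instead:
        # net.keepalive             Off
        # Illustrative cap on retries per chunk
        Retry_Limit  5
```

If the errors stop with keepalive disabled or with a shorter idle timeout, that points at idle connections being closed by the remote side rather than at Elasticsearch itself being unavailable, which would also be consistent with the Postman requests succeeding during the retries.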