Fluent-bit Connection Retry Issue

Open ShahzaibAhmedKhan31 opened this issue 1 year ago • 3 comments

Issue Title

Fluent-bit continuously retries connecting to Elasticsearch with occasional connection reset errors

Issue Description

When Fluent-bit is deployed with the configuration below via the Helm chart, logs are forwarded to Elasticsearch periodically. However, Fluent-bit hits connection errors, specifically "Connection reset by peer" and "Broken pipe", while flushing chunks to Elasticsearch. Sometimes it reconnects successfully. At other times the retries keep failing, and Fluent-bit stays in a constant retrying state without forwarding any logs.

Fluent bit Configuration


serviceMonitor:
  enabled: true
config:
  customParsers: |
    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L

    [PARSER]
        Name multiline
        Format regex
        Regex /(?

Expected Behavior

Fluent-bit should establish a stable connection to Elasticsearch without frequent connection reset or broken pipe errors. Log forwarding should occur consistently.

Steps to Reproduce

  1. Deploy Fluent-bit using the provided configuration via the Helm chart.
  2. Observe the Fluent-bit logs and Elasticsearch connection status.
  3. Monitor the intermittent connection errors and constant retrying behavior.
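
For step 3, a small script can make the error frequency easier to compare between runs. The following is a minimal sketch, not part of the original report: it only counts lines containing the two error strings quoted in the issue description, and the function names are illustrative.

```python
import sys

# The two error strings quoted in the issue description.
ERROR_PATTERNS = ("Connection reset by peer", "Broken pipe")


def count_connection_errors(lines):
    """Return a dict mapping each error string to the number of lines containing it."""
    counts = {pattern: 0 for pattern in ERROR_PATTERNS}
    for line in lines:
        for pattern in ERROR_PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts


def main():
    """Read log lines from standard input and print one count per error string."""
    for pattern, n in count_connection_errors(sys.stdin).items():
        print(f"{pattern}: {n}")
```

One way to use it would be to pipe the output of `kubectl logs` for the Fluent-bit pod into a script that calls `main()`.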

Additional Information

  • Fluent-bit version: latest Fluent Bit Helm chart
  • Elasticsearch version: 7.17 (managed Elasticsearch)

The following log snippets show what I am seeing:

When it retries connection and is successful:

log1


When it retries connection and is unsuccessful:

log2

Any assistance with this issue would be highly appreciated, as I need to address it before deploying to production.

ShahzaibAhmedKhan31 • Jan 22 '24 12:01

Hi @ShahzaibAhmedKhan31, which Fluent Bit version are you using? (Note that, from what I can see, the remote endpoint is closing the connection.)

Are you on the latest v2.2.x?

edsiper • Jan 22 '24 18:01

Hello @edsiper, thank you for your response. I am using Fluent Bit version 2.2.0. While Fluent Bit was retrying its connection, I inserted data through Postman using the Elasticsearch API, and the request succeeded. Therefore, I believe Elasticsearch was up and running normally at that time. (Note that I installed Fluent Bit through the Helm chart.)
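
The manual Postman check described above can also be scripted so it is easy to repeat while Fluent Bit is retrying. This is a minimal sketch using only the Python standard library and the Elasticsearch `POST /<index>/_doc` endpoint. The function names and the optional API-key authentication are assumptions for illustration; the original report does not say how the managed cluster authenticates.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc, api_key=None):
    """Build (but do not send) a POST <base_url>/<index>/_doc request."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the managed cluster accepts API-key authentication.
        headers["Authorization"] = f"ApiKey {api_key}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_doc",
        data=json.dumps(doc).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def index_test_document(base_url, index, doc, api_key=None, timeout=10):
    """Send the request and return the HTTP status code of the response."""
    request = build_index_request(base_url, index, doc, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `index_test_document` with the cluster URL, a throwaway index name, and a small document such as `{"message": "manual connectivity check"}` performs the same kind of insert. A one-off request like this opens a fresh connection each time, so a success shows that the cluster is reachable but does not by itself reproduce a reused connection being closed by the remote endpoint.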

ShahzaibAhmedKhan31 • Jan 23 '24 05:01

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] • May 01 '24 01:05

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] • May 07 '24 01:05

@ShahzaibAhmedKhan31 have you resolved it? I have the same issue right now.

smallc2009 • Jun 11 '24 08:06