fluent-bit
[output:s3:s3.0] PutObject request failed
Bug Report
I intermittently see the following errors in my Fluent Bit logs. While they occur, Fluent Bit fails to upload log files to the S3 bucket. Full error log:
[2024/06/12 09:38:19] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:19] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:19] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:38:19] [error] [output:s3:s3.0] Could not send chunk with tag kube-system
[2024/06/12 09:38:20] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:20] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:20] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:38:50] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:50] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:38:50] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:39:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:39:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:39:10] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:39:30] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:39:30] [error] [tls] error: error:00000006:lib(0):func(0):EVP lib
[2024/06/12 09:39:30] [error] [/src/fluent-bit/src/flb_http_client.c:1231 errno=32] Broken pipe
[2024/06/12 09:39:30] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:40:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:40:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:40:10] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:40:30] [error] [/src/fluent-bit/src/tls/openssl.c:495 errno=32] Broken pipe
[2024/06/12 09:40:30] [error] [tls] syscall error: error:00000005:lib(0):func(0):DH lib
[2024/06/12 09:40:30] [error] [/src/fluent-bit/src/flb_http_client.c:1241 errno=32] Broken pipe
[2024/06/12 09:41:20] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:41:20] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:41:20] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:42:00] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:42:00] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:42:00] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:43:00] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:10] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:10] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:43:59] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:59] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:59] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:43:59] [error] [output:s3:s3.0] Could not send chunk with tag 105250-borrow
[2024/06/12 09:43:59] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:59] [error] [http_client] broken connection to s3.eu-west-1.amazonaws.com:443 ?
[2024/06/12 09:43:59] [error] [output:s3:s3.0] PutObject request failed
[2024/06/12 09:43:59] [error] [output:s3:s3.0] Could not send chunk with tag logging
[2024/06/12 09:44:00] [error] [tls] error: error:00000006:lib(0):func(0):EVP lib
[2024/06/12 09:44:00] [error] [/src/fluent-bit/src/flb_http_client.c:1241 errno=104] Connection reset by peer
[2024/06/12 09:44:00] [error] [tls] error: error:00000001:lib(0):func(0):reason(1)
[2024/06/12 09:44:00] [error] [output:s3:s3.0] PutObject request failed
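To see how the failures are distributed, an excerpt like the one above can be tallied per component, per minute, and per failed tag. This is a minimal Python sketch, not part of Fluent Bit: the helper name `summarize_errors` and both regular expressions are assumptions based only on the line format shown in this log.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
# [2024/06/12 09:38:19] [error] [output:s3:s3.0] PutObject request failed
LINE_RE = re.compile(
    r"^\[(?P<minute>\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}\] "
    r"\[(?P<level>\w+)\] "
    r"\[(?P<component>[^\]]+)\] "
    r"(?P<message>.*)$"
)
TAG_RE = re.compile(r"Could not send chunk with tag (?P<tag>\S+)")


def summarize_errors(lines):
    """Count error lines per component and per minute, and collect failed tags."""
    by_component = Counter()
    by_minute = Counter()
    failed_tags = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("level") != "error":
            continue  # skip anything that is not an [error] log line
        by_component[match.group("component")] += 1
        by_minute[match.group("minute")] += 1
        tag_match = TAG_RE.search(match.group("message"))
        if tag_match:
            failed_tags[tag_match.group("tag")] += 1
    return {
        "by_component": by_component,
        "by_minute": by_minute,
        "failed_tags": failed_tags,
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py < fluent-bit.log
    summary = summarize_errors(sys.stdin)
    for section, counts in summary.items():
        print(section)
        for key, count in counts.most_common():
            print(f"  {count:5d}  {key}")
```

Grouping by minute makes it easier to check whether the broken connections cluster around the `upload_timeout` interval or are spread evenly.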
I am using Fluent Bit version 2.2.0 with the configuration below.
Fluent-Bit configurations:
config:
  service: |
    [SERVICE]
        Flush 10
        Log_Level info
        Daemon off
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_PORT 2020
        Health_Check On
        storage.path /var/log/fluent-bit-buffer
        storage.sync full
        storage.metrics on
        storage.delete_irrecoverable_chunks on
        storage.max_chunks_up 1000
        storage.backlog.mem_limit 300Mi

  ## https://docs.fluentbit.io/manual/pipeline/inputs
  inputs: |
    [INPUT]
        Name tail
        Tag kube.*
        storage.type filesystem
        Path /var/log/containers/*.log
        multiline.parser cri, docker
        DB /var/log/flb_kube.db
        DB.locking true
        Buffer_Chunk_Size 1MB
        Buffer_Max_Size 1MB
        Skip_Long_Lines On
        Refresh_Interval 5

  ## https://docs.fluentbit.io/manual/pipeline/filters
  filters: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Kube_URL https://kubernetes.default.svc:443
        Kube_CA_File /<path_to_those_files>
        Kube_Token_File /<path_to_those_files>
        Merge_Log On
        Merge_Log_Key log4j
        Cache_Use_Docker_Id On
        Keep_Log On
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
        Labels On
        Annotations Off
        Buffer_Size 0

    [FILTER]
        Name rewrite_tag
        Match kube.*
        Rule $kubernetes['namespace_name'] ^.*$ $kubernetes['namespace_name'] false
        Emitter_Name ns_emitter
        Emitter_Storage.type filesystem

  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        Name s3
        Match *
        bucket <bucket_name>
        region eu-west-1
        upload_timeout 1m
        total_file_size 5M
        s3_key_format /account_name/%Y/%m/%d/%H/$TAG/$UUID.gz
        s3_key_format_tag_delimiters .-
        canned_acl bucket-owner-full-control
        use_put_object true
        compression gzip
        store_dir_limit_size 10G
        retry_limit 5
Resources allocated to Fluent-Bit pods:
resources:
  limits:
    memory: 300Mi
  requests:
    cpu: 400m
    memory: 300Mi
I recently set retry_limit to 5; it was previously at its default of 1.
Edit: Updating retry_limit didn't help.
Edit: I cannot attach a snapshot of the metrics here, but the improvement seen is around 90%.
The rate is not uniform, but each Fluent Bit pod opens around 1200 files per hour (according to fluentbit_input_files_opened_total{}).
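Since HTTP_Server is enabled on port 2020 in the configuration above, counters such as this one can be read from the pod's Prometheus endpoint and set against the output-side error and retry counters. The sketch below is a rough illustration under stated assumptions: the URL (loopback address with the `/api/v1/metrics/prometheus` path), the helper names `fetch_metrics` and `sum_metric`, and the output metric names in the usage section come from Fluent Bit's monitoring documentation as I understand it, not from this report, and should be checked against the running version.

```python
import re
import urllib.request

# Assumed endpoint: port 2020 comes from HTTP_PORT above; host and path are assumptions.
METRICS_URL = "http://127.0.0.1:2020/api/v1/metrics/prometheus"

# One sample line of the Prometheus text format: name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)"
    r"(?:\{(?P<labels>.*)\})?"
    r"\s+(?P<value>\S+)"
    r"(?:\s+(?P<timestamp>-?\d+))?$"
)


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Download the metrics page as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def sum_metric(exposition_text, metric_name):
    """Sum one metric across all of its label sets (for example, all plugin instances)."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match and match.group("name") == metric_name:
            total += float(match.group("value"))
    return total


if __name__ == "__main__":
    text = fetch_metrics()
    for name in (
        "fluentbit_input_files_opened_total",
        "fluentbit_output_errors_total",          # assumed metric name
        "fluentbit_output_retries_total",         # assumed metric name
        "fluentbit_output_retries_failed_total",  # assumed metric name
    ):
        print(name, sum_metric(text, name))
```

Sampling these counters twice, an hour apart, and taking the difference gives a per-hour figure comparable to the 1200 files mentioned above.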
The infrastructure is not broken, and I can see most of the logs; only a few logs are missed while Fluent Bit is hitting this issue. I have seen the same issue raised in the past, but none of those threads reached a solution or a fix. Is there something I need to configure, or is this an ongoing issue?
Please let me know if there is any more information required for the resolution.
Can you try with the latest version 3.0.7?
Sure can! However, I couldn't see any fix in the later version for the S3 plugin.
Looking at the stack, though, it may be TLS-related. It is always best to try the latest version anyway to confirm.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.