logging-operator
Versions newer than 4.5.1 don't attach extraVolumes in fluentd
Bugs should be filed for issues encountered whilst operating logging-operator. You should first attempt to resolve your issues through the community support channels, e.g. Slack, in order to rule out individual configuration errors. #logging-operator Please provide as much detail as possible.
Describe the bug: A clear and concise description of what the bug is.
I have logging-operator 4.5.1 deployed in an RKE2 Kubernetes cluster. When I update to a newer version such as 4.5.3 or 4.5.6, the logs buffered in fluentd are never sent to Splunk.
Expected behaviour: A concise description of what you expected to happen.
The logs are sent to Splunk.
Steps to reproduce the bug: Steps to reproduce the bug should be clear and easily reproducible to help people gain an understanding of the problem.
Additional context: Add any other context about the problem here.
Environment details:
- Kubernetes version (e.g. v1.15.2): RKE2 1.26.11
- Cloud-provider/provisioner (e.g. AKS, GKE, EKS, PKE etc):
- logging-operator version (e.g. 2.1.1): 4.5.1
- Install method (e.g. helm or static manifests): helm
- Logs from the misbehaving component (and any other relevant logs):
```
2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1580:in `do_start'
2024-04-26T15:41:19.729694776+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1575:in `start'
2024-04-26T15:41:19.729700576+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:662:in `start'
2024-04-26T15:41:19.729705876+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:602:in `connection_for'
2024-04-26T15:41:19.729711176+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:892:in `request'
2024-04-26T15:41:19.729717176+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk_hec.rb:351:in `write_to_splunk'
2024-04-26T15:41:19.729722476+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk.rb:103:in `block in write'
2024-04-26T15:41:19.729727576+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/benchmark.rb:311:in `realtime'
2024-04-26T15:41:19.729732776+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk.rb:102:in `write'
2024-04-26T15:41:19.729738076+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk_hec.rb:154:in `write'
2024-04-26T15:41:19.729763375+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
2024-04-26T15:41:19.729773975+02:00 2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
2024-04-26 13:41:19 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
```

- Resource definition (possibly in YAML format) that caused the issue, without sensitive data:
/kind bug
More info:
failed to flush the buffer. retry_times=3 next_retry_time=2024-04-26 14:57:56 +0000 chunk="61700e0f9e5790e5efb53ae6d92b1e5f" error_class=OpenSSL::SSL::SSLError error="SSL_CTX_load_verify_file: system lib"
I tried updating from 4.5.2 to 4.5.6. Once the upgrade finished, the fluentd pod logs showed this error:
error_class=OpenSSL::SSL::SSLError error="SSL_CTX_load_verify_file: system lib"
This is the log:
```
2024-04-30 09:49:49 +0000 [warn]: #0 [flow:gitlab:gitlab-to-splunk:output:gitlab:splunk-gitlab-dev] failed to flush the buffer. retry_times=9 next_retry_time=2024-04-30 09:58:20 +0000 chunk="6174d053bd6f5921236fadd5329cdb94" error_class=OpenSSL::SSL::SSLError error="SSL_CTX_load_verify_file: system lib"
2024-04-30T11:49:49.169562308+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1666:in `initialize'
2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1666:in `new'
2024-04-30T11:49:49.169585608+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1666:in `connect'
2024-04-30T11:49:49.169600808+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1580:in `do_start'
2024-04-30T11:49:49.169608108+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/net/http.rb:1575:in `start'
2024-04-30T11:49:49.169615008+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:662:in `start'
2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:602:in `connection_for'
2024-04-30T11:49:49.169627208+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/net-http-persistent-4.0.2/lib/net/http/persistent.rb:892:in `request'
2024-04-30T11:49:49.169659408+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk_hec.rb:351:in `write_to_splunk'
2024-04-30T11:49:49.169682307+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk.rb:103:in `block in write'
2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/lib/ruby/3.2.0/benchmark.rb:311:in `realtime'
2024-04-30T11:49:49.169695207+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk.rb:102:in `write'
2024-04-30T11:49:49.169701407+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-splunk-hec-1.3.3/lib/fluent/plugin/out_splunk_hec.rb:154:in `write'
2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
2024-04-30T11:49:49.169713607+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
2024-04-30T11:49:49.169726507+02:00 2024-04-30 09:49:49 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
```
I checked the release notes, but none of the changes appear to affect SSL or anything related.
Good morning,
I found the root cause.
The current Logging definition has an extraVolumes entry that passes the CA certificates from the worker nodes' host filesystem into the fluentd pods:
```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: &logging-app-dev gitlab-logging-dev
  namespace: cattle-logging-system
spec:
  loggingRef: *logging-app-dev
  fluentbit:
    security:
      roleBasedAccessControlCreate: true
  fluentd:
    security:
      roleBasedAccessControlCreate: true
      podSecurityContext:
        runAsNonRoot: false
    scaling:
      replicas: 3
    bufferStorageVolume:
      pvc:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
    extraVolumes:
      - volumeName: trusted-cas-volume
        path: /home/fluent/certs
        containerName: fluentd
        volume:
          hostPath:
            path: /etc/pki/ca-trust/source/anchors
  controlNamespace: cattle-logging-system
  watchNamespaces:
    - gitlab
```
But when the Logging resource is created, this extra volume is never created inside the fluentd pods.
With the same config in 4.5.1, the extraVolume was created correctly.
I also tried adding a hostPath or a Secret through extraVolumes via FluentdConfig and hit the same problem.
@frit0-rb can you please use fenced code blocks so that we can see whitespaces as well?
Sorry @pepov, I added the fenced code blocks.
thx, I've started to look into this, but I have some conflicting priorities, I have to ask for your patience
No problem @pepov, we are not far behind the last stable update, so take your time.
Hello @pepov, there is a new CVE in fluentbit: https://thehackernews.com/2024/05/linguistic-lumberjack-vulnerability.html I need to resolve this problem as soon as possible because I need to update to 4.6.0.
You can use the latest fluentbit at any time without upgrading logging-operator by setting the fluentbit image version explicitly.
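A minimal sketch of what that could look like on the Logging resource from this thread, assuming the `image.repository` and `image.tag` fields of the fluentbit spec. Only the relevant fields are shown, and the repository and tag values are illustrative assumptions, so check them against your registry and the patched Fluent Bit release you need:

```yaml
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: gitlab-logging-dev
spec:
  controlNamespace: cattle-logging-system
  fluentbit:
    image:
      # Assumed values: replace with the registry path and patched tag you actually use.
      repository: fluent/fluent-bit
      tag: 3.0.4
```

Pinning the image this way only changes the fluentbit DaemonSet; the operator and fluentd versions stay as they are.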
Looking at the code of statefulset.go, it seems both Volume and PersistentVolumeClaim must be specified. Not sure why. Also, mounting secrets or configmaps is not supported at all. See https://github.com/kube-logging/logging-operator/blob/61e6eb05c56c393cd929d96e66e4c39f346c4882/pkg/resources/fluentd/statefulset.go#L53 These lines were changed 5 months ago.
thx @mgalesloot ! can someone help me verify this fixes the issue? https://github.com/kube-logging/logging-operator/pull/1765
Also this one from @nak0f (coming soon) will extend the support for configmaps: https://github.com/cisco-open/operator-tools/pull/251
fyi I've updated the above PR with a sample that seems to fix this issue as I would expect
Hi @pepov, the issue is closed, so in which version is it fixed? Which version do I need to update to in order to use extraVolumes?
In the next release, which is going to be 4.8.