logging-operator
got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)"
Describe the bug: Error when using syslog output
Expected behaviour: Logs should be sent to defined syslog cluster output
Steps to reproduce the bug: Configure the resources below
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: syslog
  namespace: logging
spec:
  syslog:
    buffer:
      timekey: 30s
      timekey_wait: 0s
    host: syslog.example.net
    insecure: true
    port: 20444
    transport: tls
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: hosttailer-flow
  namespace: logging
spec:
  filters:
    - tag_normaliser: {}
  globalOutputRefs:
    - syslog
  match:
    - select:
        labels:
          app.kubernetes.io/name: host-tailer
Additional context: Fluentd throws errors:
2024-04-05 11:29:00 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)"
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/socket.rb:41:in `socket_create'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:65:in `find_or_create_socket'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:39:in `write'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
2024-04-05 11:29:00 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2024-04-05 11:29:00 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c1e8b4b20b9380467be5ff0a45b.log
2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] got unrecoverable error in primary and no secondary error_class=ArgumentError error="wrong number of arguments (given 4, expected 3)"
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/socket.rb:41:in `socket_create'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:65:in `find_or_create_socket'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluent-plugin-syslog_rfc5424-0.9.0.rc.8/lib/fluent/plugin/out_syslog_rfc5424.rb:39:in `write'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1225:in `try_flush'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
2024-04-05 11:29:01 +0000 [warn]: #0 /usr/local/bundle/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c20915175b74f5d02915b7386cb.log
Environment details:
- Kubernetes version 1.27
- Cloud-provider/provisioner : AKS
- logging-operator version : 4.6.0
- Install method (e.g. helm or static manifests): helm
- Logs from the misbehaving component (and any other relevant logs):
- Resource definition (possibly in YAML format) that caused the issue, without sensitive data:
/kind bug
@kefiras this error message alone doesn't tell much about the original problem
- have you tried looking at the referred bad chunk?
2024-04-05 11:29:01 +0000 [warn]: #0 [clusterflow:logging:hosttailer-flow:clusteroutput:logging:syslog] bad chunk is moved to /buffers/backup/worker0/clusterflow_logging_hosttailer-flow_clusteroutput_logging_syslog/61557c20915175b74f5d02915b7386cb.log
- have you tried raising the log level? (logLevel: debug in fluentd spec)
- have you/can you check the error/warning messages on the receiving side if there were any?
Debug is already enabled
bad chunk
??f?.FsN??time?2024-04-09T10:18:22.776368974Z?message?:Apr 9 10:18:22 aks-prometheus-18130450-vmss000000 kernel: [498058.497065] calico-packet: IN=azve56f4c00502 OUT=azva623c2d61aa MAC=aa:aa:aa:aa:aa:aa:6a:73:f2:79:14:75:08:00 SRC=10.244.3.144 DST=10.244.3.135 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=35709 DF PROTO=TCP SPT=36200 DPT=2020 WINDOW=64240 RES=0x00 SYN URGP=0 ?app?host-tailer?container_image?Lrepo-aks.qa.example.net/example/linux/exm/exm/vendor/fluent/fluent-bit:2.1.8?clustername?aks1kexm1?datacenter?eastus2?env?nonprod?family?logging?mnemonic?exm?hostname?"aks-prometheus-18130450-vmss000000?namespace?logging?pod_id?$2461060d-4eb9-41ec-8fe2-eefcf4bad090?pod_name?filetail-host-tailer-phq7s?service?syslog/ $
I haven't checked the receiving side, but I doubt anything is being sent
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions!
We encountered the same error. Is it possible to open this issue again?
@liz-86 can you add some details to this? do you see this error with the latest image versions as well?
Yes, we tested our configuration (much the same as the one above, but with tcp transport instead of tls) with the latest fluentd image (kube-logging/fluentd-images:v1.16-full). Our ClusterOutput:
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: syslog
  namespace: logging
spec:
  syslog:
    buffer:
      flush_thread_count: 16
      timekey: 1m
      timekey_use_utc: true
      timekey_wait: 30s
    format:
      type: json
    host: syslog.example.net
    insecure: true
    port: 5056
    transport: tcp
The created fluentd.conf is the following (from k8s secret loggging-operator-logging-fluentd-app):
<match **>
  @type syslog_rfc5424
  @id clusterflow:logging:syslog-flow:clusteroutput:logging:syslog-output
  host syslog.example.net
  insecure true
  port 5056
  transport tcp
  <buffer tag,time>
    @type file
    chunk_limit_size 8MB
    flush_thread_count 16
    path /buffers/clusterflow:logging:syslog-flow:clusteroutput:logging:syslog-output.*.buffer
    retry_forever true
    timekey 1m
    timekey_use_utc true
    timekey_wait 30s
  </buffer>
  <format>
    @type json
  </format>
</match>
Same issue here.
Provider: RKE2
Kubernetes Version: v1.27.12+rke2r1
Chart: Logging (103.1.1+up4.4.0)
What are your fluentd and fluentbit image versions?
It seems I totally misunderstood the issue originally. I've looked at it once again, and it seems that the Ruby 3 upgrade broke the syslog plugin: Ruby 2.7 deprecated the automatic conversion of a trailing hash into keyword arguments, and Ruby 3.0 removed it (see https://blog.saeloun.com/2019/10/07/ruby-2-7-keyword-arguments-redesign/).
I've made a change here: https://github.com/pepov/fluent-plugin-syslog_rfc5424/commit/6404b617bc8d5ddd9cf4628cb601cf9b4718e7fb
Then applied on my fork of the fluentd image here: https://github.com/kube-logging/fluentd-images/compare/main...pepov:fluentd-images:main
I didn't have the time to test it with a syslog receiver, could you please give it a try with ghcr.io/pepov/fluentd:v1.16-full?
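For readers unfamiliar with the Ruby change mentioned above, here is a minimal sketch of how it can produce "wrong number of arguments (given 4, expected 3)". The `socket_create` method below is only a stand-in with three positional parameters plus keyword options; it is not copied from fluentd or from the plugin, and `call_with_hash` and `call_with_splat` are hypothetical names used for illustration.

```ruby
# Stand-in helper (assumption, not the real fluentd code):
# three positional parameters plus arbitrary keyword options.
def socket_create(proto, host, port, **kwargs)
  [proto, host, port, kwargs]
end

# Pre-Ruby-3 calling style: the options hash is passed as a plain
# trailing argument. Ruby 2.x converted it into keywords implicitly;
# Ruby 3.0+ treats it as a fourth positional argument and raises
# ArgumentError ("wrong number of arguments (given 4, expected 3)").
def call_with_hash(opts)
  socket_create(:tcp, "syslog.example.net", 5056, opts)
rescue ArgumentError => e
  e.message
end

# Ruby-3-compatible calling style: the double splat passes the hash
# as keyword arguments explicitly.
def call_with_splat(opts)
  socket_create(:tcp, "syslog.example.net", 5056, **opts)
end

opts = { insecure: true }
puts call_with_hash(opts).inspect
puts call_with_splat(opts).inspect
```

The explicit `**` form behaves the same on Ruby 2.7 and Ruby 3, which is why it is the usual way to make older call sites compatible.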
Thanks for looking into the issue. I can confirm that with the new image there are no more errors in fluentd. I need to talk to another team to see if they are getting the desired logs. But it looks good at the moment.
Thanks again!
EDIT: All seems to be working perfectly. The other teams are getting logs. :)
thx for the confirmation, I'm making the PRs to have the fix released asap
The images have been updated with the fix as of the 148th build: v1.16-full-build.148 and v1.16-full
For logging operator 4.8: v1.16-4.8-full-build.148 and v1.16-4.8-full