
built-in multiline parser breaks tail input in versions 1.8.12 and newer

Open dschaaff opened this issue 2 years ago • 8 comments

Bug Report

Describe the bug

In versions 1.8.12 and newer, using the built-in multiline parser as described at https://docs.fluentbit.io/manual/pipeline/inputs/tail#multiline-and-containers-v1.8 prevents all logs from being shipped to the configured outputs.

I'm using the official Helm chart and have confirmed that the bug is present in versions 1.8.12, 1.8.13, 1.9.0, 1.9.1, and 1.9.2.

To Reproduce

Deploy the Fluent Bit Kubernetes DaemonSet using the following config:

config:
  service: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level {{ .Values.logLevel }}
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port {{ .Values.metricsPort }}
        Health_Check On
        Dns.prefer_ipv4 on
  ## https://docs.fluentbit.io/manual/pipeline/inputs
  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        Path_key log_file
        Refresh_Interval 1
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 100MB
        Buffer_Chunk_Size 100k
        Buffer_Max_Size 1M
        # storage.type filesystem
        Skip_Long_Lines On
        Skip_Empty_Lines On
  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        Name stdout
        Match *

Observe that no container logs are sent to stdout.
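
A trimmed-down sketch of the combination under test, for trying this outside the Helm chart. It assumes the built-in parsers are enabled through multiline.parser and that Path_key is set as in the config above; the Log_Level value is a placeholder, and this is an illustration rather than the deployed values:

```text
[SERVICE]
    Flush 1
    # Placeholder level; the Helm values template this setting.
    Log_Level info

[INPUT]
    Name tail
    Path /var/log/containers/*.log
    Path_key log_file
    # Built-in multiline parsers for Docker and CRI log formats.
    multiline.parser docker, cri
    Tag kube.*

[OUTPUT]
    Name stdout
    Match *
```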

Expected behavior

Container logs should appear on stdout. The current workaround is to use the old multiline parser:

[INPUT]
        Name tail
        Path /var/log/containers/*.log
        Path_key log_file
        Refresh_Interval 1
        # multiline.parser docker, cri
        Parser docker
        Docker_Mode On
        Docker_Mode_Parser container_firstline
        Tag kube.*
        Mem_Buf_Limit 100MB
        Buffer_Chunk_Size 100k
        Buffer_Max_Size 1M
        storage.type filesystem
        Skip_Long_Lines On
        Skip_Empty_Lines On
        DB /var/lib/fluentbit/pos/kube.pos.db
        DB.journal_mode WAL
        DB.sync normal
        DB.locking true
[PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

Your Environment

  • Version used: 1.9.2
  • Configuration:
serviceMonitor:
  enabled: true
  interval: 15s
  scrapeTimeout: 10s
  jobLabel: fluentbit

dnsPolicy: ClusterFirstWithHostNet

resources:
  requests:
    cpu: 100m
    memory: 350Mi
  limits:
    memory: 500Mi

priorityClassName: system-node-critical

hostNetwork: true # required for scraping k8s metadata from kubelet instead of api server

rbac:
  create: true
  nodeAccess: true # required for scraping k8s metadata from kubelet instead of api server

extraVolumes:
  - name: pos
    hostPath:
      path: /var/lib/fluentbit/pos
      type: DirectoryOrCreate
  - name: buffer
    hostPath:
      path: /var/lib/fluentbit/buffer
      type: DirectoryOrCreate

extraVolumeMounts:
  - name: pos
    mountPath: /var/lib/fluentbit/pos
  - name: buffer
    mountPath: /var/lib/fluentbit/buffer

updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 40

config:
  service: |
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level {{ .Values.logLevel }}
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port {{ .Values.metricsPort }}
        Health_Check On
        storage.path /var/lib/fluentbit/buffer
        storage.backlog.mem_limit 100M
        storage.sync full
        storage.checksum on
        storage.max_chunks_up 192
        storage.metrics on
        Dns.prefer_ipv4 on
  ## https://docs.fluentbit.io/manual/pipeline/inputs
  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        Path_key log_file
        Refresh_Interval 1
        # multiline.parser docker, cri
        Parser docker
        Docker_Mode On
        Docker_Mode_Parser container_firstline
        Tag kube.*
        Mem_Buf_Limit 100MB
        Buffer_Chunk_Size 100k
        Buffer_Max_Size 1M
        storage.type filesystem
        Skip_Long_Lines On
        Skip_Empty_Lines On
        DB /var/lib/fluentbit/pos/kube.pos.db
        DB.journal_mode WAL
        DB.sync normal
        DB.locking true
    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Mem_Buf_Limit 50MB
        Read_From_Tail On
        storage.type filesystem
        DB /var/lib/fluentbit/pos/systemd.pos.db
  ## https://docs.fluentbit.io/manual/pipeline/filters
  filters: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On
        Kube_Tag_Prefix kube.var.log.containers.
        Annotations false
        Buffer_Size 5MB
        Use_Kubelet true
        Kubelet_Port 10250
    [FILTER]
        Name aws
        Match *
        imds_version v1
        az true
        ec2_instance_id true
        ami_id true
    [FILTER]
        Name modify
        Match *
        Add cluster dev
  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        Name forward
        Match *
        Host fluentd-internal.fluentd.svc.cluster.local.
        port 24224
        storage.total_limit_size 1000M
        net.keepalive on
        net.keepalive_idle_timeout 30
        net.keepalive_max_recycle 1000
        Workers 1
        Retry_Limit 1000
  ## https://docs.fluentbit.io/manual/pipeline/parsers
  customParsers: |
    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

  • Environment name and version (e.g. Kubernetes? What version?):

AWS EKS Kubernetes Version 1.22 using the official fluent bit docker image.

dschaaff, Apr 18 '22 23:04

This issue is still present in 1.9.3.

dschaaff, Apr 27 '22 23:04

My test results are the same as yours #5245

chenlingmin, Apr 29 '22 06:04

@dschaaff I just upgraded to version 1.9.3, and my previous Docker_Mode configuration no longer works: I'm getting the logs line by line, whereas I expect all the multiline logs to be combined into a single record. When I tried the approach from the Fluent Bit documentation (multiline.parser docker, cri), I got the same result. Were you able to fix that in the end?

shake76, May 10 '22 17:05

@shake76 I'll have to test that out to confirm. I only verified that the built-in multiline parser is not working in 1.9.3.

dschaaff, May 11 '22 02:05

I'm having similar issues with multiline and tail. I found that if the tail input configuration includes options that modify the record (Path_Key, Offset_Key, Key), it behaves incorrectly. We're stuck on 1.8.15, where I managed to get it working by dropping Path_Key, Offset_Key, and Key from the configuration, which is less than ideal.
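
A minimal sketch of that workaround, assuming the tail input from the original report. The path and tag are taken from that report, and this illustrates the idea rather than the exact configuration in use:

```text
[INPUT]
    Name tail
    Path /var/log/containers/*.log
    # Path_Key, Offset_Key and Key are deliberately left out: options that
    # modify the record appeared to break the built-in multiline parser.
    multiline.parser docker, cri
    Tag kube.*
```

The trade-off is that records no longer carry the source file path or offset, so anything downstream that relied on those fields would need another way to get them.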

mbystedt, May 26 '22 18:05

I tested and can confirm this is still broken in 1.9.7.

mbystedt, Aug 10 '22 23:08

It can be reproduced with 1.9.7 and the multiline example from https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/multiline-parsing

375gnu, Aug 18 '22 13:08

I tested and can confirm this is still broken in 1.9.8.

mbystedt, Sep 13 '22 16:09

Hello @dschaaff, the fix for this issue, proposed by @BinaryFissionGames in PR https://github.com/fluent/fluent-bit/issues/6240#event-7644118753, was merged into the master branch and will be released in Fluent Bit v2.0.0.

You can follow these articles to test Fluent Bit v2.0, which includes this fix. Because it is not an official release yet, it is not intended for production environments.

https://docs.fluentbit.io/manual/installation/sources/download-source-code
https://docs.fluentbit.io/manual/v/2.0-pre/installation/sources/build-and-install

Please note: the master branch will become our next release, v2.0.0. You can also test it with an unofficial image: https://github.com/fluent/fluent-bit/tree/master/dockerfiles#ghcrio-topology

Ricardo.

RicardoAAD, Oct 24 '22 14:10

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot], Jan 23 '23 02:01

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot], Jan 28 '23 02:01