fluent-bit icon indicating copy to clipboard operation
fluent-bit copied to clipboard

New multiline parser doesn't work as expected

Open ogarcia opened this issue 3 years ago • 16 comments

Bug Report

Comes from #3500

Describe the bug I'm trying to use new multiline parser to parse logs included in #3500 here is a fragment of a multiline event.

2021-05-17T17:35:01.184675702Z stdout F [DEBUG] THIS IS A SIMPLE LINE (THE NEXT IS THE MULTILINE EVENT)
2021-05-17T17:35:01.184675702Z stdout F [DEBUG] p.s.a.o.a.n.h.HttpHandler in AsyncHttpClient-2-2  - 
2021-05-17T17:35:01.184734407Z stdout F 
2021-05-17T17:35:01.184747208Z stdout F Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
2021-05-17T17:35:01.184757609Z stdout F GET /v2/auth/user HTTP/1.1
2021-05-17T17:35:01.184777011Z stdout F Authorization: Bearer AUTH
2021-05-17T17:35:01.184809513Z stdout F X-Correlation-ID: 5090
2021-05-17T17:35:01.184819114Z stdout F Host: example.com
2021-05-17T17:35:01.184828415Z stdout F Accept: */*
2021-05-17T17:35:01.184836615Z stdout F User-Agent: AHC/2.0
2021-05-17T17:35:01.184844716Z stdout F 
2021-05-17T17:35:01.184852417Z stdout F Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
2021-05-17T17:35:01.184861017Z stdout F HTTP/1.1 200 OK
2021-05-17T17:35:01.184870618Z stdout F Access-Control-Allow-Credentials: true
2021-05-17T17:35:01.184878619Z stdout F Content-Security-Policy: default-src https:
2021-05-17T17:35:01.18488642Z stdout F Content-Type: application/json; charset=utf-8
2021-05-17T17:35:01.18489482Z stdout F Date: Mon, 17 May 2021 17:35:01 GMT
2021-05-17T17:35:01.184903121Z stdout F Etag: W/"TAG/hs"
2021-05-17T17:35:01.184911322Z stdout F Referrer-Policy: no-referrer
2021-05-17T17:35:01.184918922Z stdout F Server: 4P
2021-05-17T17:35:01.184927623Z stdout F Strict-Transport-Security: max-age=315360000; includeSubDomains; preload
2021-05-17T17:35:01.184935324Z stdout F Vary: Origin
2021-05-17T17:35:01.184942724Z stdout F X-Content-Type-Options: nosniff
2021-05-17T17:35:01.184950825Z stdout F X-Correlator: fd945034-627e-4c2e-b8d6-9368fd71feb7
2021-05-17T17:35:01.184959026Z stdout F X-Frame-Options: SAMEORIGIN
2021-05-17T17:35:01.184967226Z stdout F X-Xss-Protection: 1; mode=block
2021-05-17T17:35:01.184976127Z stdout F Transfer-Encoding: chunked
2021-05-17T17:35:01.184675702Z stdout F [DEBUG] THIS IS A NEW SIMPLE LINE

And my config is:

    [INPUT]
        Name                tail
        Tag                 kube.*
        Path                /var/log/containers/appname*app*.log
        DB                  /var/log/kube_app.db
        Mem_Buf_Limit       200MB
        Skip_Long_Lines     Off
        Refresh_Interval    10
        multiline.parser    multiline_app

    [FILTER]
        Name                kubernetes
        Match               kube.*app*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        Merge_Parser        appname_app
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off

    [MULTILINE_PARSER]
        Name           multiline_app
        Type           regex
        Flush_Timeout  5000
        #
        # Regex rules for multiline parsing
        # ---------------------------------
        #
        # configuration hints:
        #
        #  - first state always has the name: start_state
        #  - every field in the rule must be inside double quotes
        #
        # rules |   state name  | regex pattern                                                                    | next state
        # ------|---------------|----------------------------------------------------------------------------------------------
        rule      "start_state"   "^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>\[[^\]]+\].*)$"  "cont"
        rule      "cont"          "^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$"            "cont"

    [PARSER]
        Name        appname_app
        Format      regex
        Regex       \[(?<level>[^\]]+)\] (?<logger>[^ ]+) in (?:(?<thread_name>.+?)(?:-(?<thread_id>[\d-]+))?) (?:(?<marker_name>[^= ]+)=(?<marker_id>[^ ]+))? ?- (?<message>(.|\n|\r)*)

As you can see, the multiline parser is very simple, if the line has not format 2021-05-17T17:35:01.184675702Z stdout F [LEVEL] XXXXXXX is a continuation of multiline, but this don't work. Simply picks every line as new event.

Your Environment

  • Version used: 1.8.2

ogarcia avatar Jul 21 '21 08:07 ogarcia

I gave it a try and indeed found a problem as you reported. The problem appears on my case with filter multiline. Simple message concatenation on tail works fine (I've adjusted the multiline rules):

[0] tail.0: [0.000000000, {"log"=>"[DEBUG] THIS IS A SIMPLE LINE (THE NEXT IS THE MULTILINE EVENT)
"}]
[1] tail.0: [1626879674.165296408, {"log"=>"[DEBUG] p.s.a.o.a.n.h.HttpHandler in AsyncHttpClient-2-2  - 
 
Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
GET /v2/auth/user HTTP/1.1
Authorization: Bearer AUTH
X-Correlation-ID: 5090
Host: example.com
Accept: */*
User-Agent: AHC/2.0
 
Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Content-Security-Policy: default-src https:
Content-Type: application/json; charset=utf-8
Date: Mon, 17 May 2021 17:35:01 GMT
Etag: W/"TAG/hs"
Referrer-Policy: no-referrer
Server: 4P
Strict-Transport-Security: max-age=315360000; includeSubDomains; preload
Vary: Origin
X-Content-Type-Options: nosniff
X-Correlator: fd945034-627e-4c2e-b8d6-9368fd71feb7
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
Transfer-Encoding: chunked
"}]
[2] tail.0: [1626879674.165296408, {"log"=>"[DEBUG] THIS IS A NEW SIMPLE LINE
"}]

I will keep troubleshooting...

edsiper avatar Jul 21 '21 15:07 edsiper

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Aug 21 '21 01:08 github-actions[bot]

Bump!

ogarcia avatar Aug 21 '21 09:08 ogarcia

I have exactly the same problem!

danilnich avatar Aug 30 '21 13:08 danilnich

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Oct 01 '21 01:10 github-actions[bot]

Bump!

ogarcia avatar Oct 04 '21 12:10 ogarcia

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Nov 05 '21 01:11 github-actions[bot]

Bump!

ogarcia avatar Nov 05 '21 07:11 ogarcia

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Dec 10 '21 02:12 github-actions[bot]

Bump!

ogarcia avatar Dec 10 '21 09:12 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Mar 15 '22 02:03 github-actions[bot]

Bump!

ogarcia avatar Mar 15 '22 07:03 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Jun 14 '22 02:06 github-actions[bot]

Bump!

ogarcia avatar Jun 14 '22 06:06 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Sep 14 '22 02:09 github-actions[bot]

Bump!

ogarcia avatar Sep 14 '22 08:09 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Dec 15 '22 02:12 github-actions[bot]

Bump!

ogarcia avatar Dec 15 '22 07:12 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Mar 16 '23 02:03 github-actions[bot]

Bump!

ogarcia avatar Mar 16 '23 07:03 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Jun 15 '23 02:06 github-actions[bot]

Bump!

ogarcia avatar Jun 15 '23 06:06 ogarcia

Could you please provide us with an update of the status of this issue using fluent-bit 2.1? From what I see in this thread it should be fixed but the bumps suggest otherwise.

leonardo-albertovich avatar Jun 15 '23 06:06 leonardo-albertovich

The bug is the same as described in the first comment, only now when testing what happens is that instead of extracting each line as a new event, it takes the whole block as a new event (it seems to ignore the multiline filter).

Tested with fluentbit version=2.1.4, commit=4854f38c7c

ogarcia avatar Jun 15 '23 08:06 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Sep 15 '23 01:09 github-actions[bot]

Bump!

ogarcia avatar Sep 15 '23 08:09 ogarcia

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Dec 16 '23 01:12 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Dec 22 '23 01:12 github-actions[bot]