fluent-bit
fluent-bit copied to clipboard
New multiline parser doesn't work as expected
Bug Report
Comes from #3500
Describe the bug I'm trying to use new multiline parser to parse logs included in #3500 here is a fragment of a multiline event.
2021-05-17T17:35:01.184675702Z stdout F [DEBUG] THIS IS A SIMPLE LINE (THE NEXT IS THE MULTILINE EVENT)
2021-05-17T17:35:01.184675702Z stdout F [DEBUG] p.s.a.o.a.n.h.HttpHandler in AsyncHttpClient-2-2 -
2021-05-17T17:35:01.184734407Z stdout F
2021-05-17T17:35:01.184747208Z stdout F Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
2021-05-17T17:35:01.184757609Z stdout F GET /v2/auth/user HTTP/1.1
2021-05-17T17:35:01.184777011Z stdout F Authorization: Bearer AUTH
2021-05-17T17:35:01.184809513Z stdout F X-Correlation-ID: 5090
2021-05-17T17:35:01.184819114Z stdout F Host: example.com
2021-05-17T17:35:01.184828415Z stdout F Accept: */*
2021-05-17T17:35:01.184836615Z stdout F User-Agent: AHC/2.0
2021-05-17T17:35:01.184844716Z stdout F
2021-05-17T17:35:01.184852417Z stdout F Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
2021-05-17T17:35:01.184861017Z stdout F HTTP/1.1 200 OK
2021-05-17T17:35:01.184870618Z stdout F Access-Control-Allow-Credentials: true
2021-05-17T17:35:01.184878619Z stdout F Content-Security-Policy: default-src https:
2021-05-17T17:35:01.18488642Z stdout F Content-Type: application/json; charset=utf-8
2021-05-17T17:35:01.18489482Z stdout F Date: Mon, 17 May 2021 17:35:01 GMT
2021-05-17T17:35:01.184903121Z stdout F Etag: W/"TAG/hs"
2021-05-17T17:35:01.184911322Z stdout F Referrer-Policy: no-referrer
2021-05-17T17:35:01.184918922Z stdout F Server: 4P
2021-05-17T17:35:01.184927623Z stdout F Strict-Transport-Security: max-age=315360000; includeSubDomains; preload
2021-05-17T17:35:01.184935324Z stdout F Vary: Origin
2021-05-17T17:35:01.184942724Z stdout F X-Content-Type-Options: nosniff
2021-05-17T17:35:01.184950825Z stdout F X-Correlator: fd945034-627e-4c2e-b8d6-9368fd71feb7
2021-05-17T17:35:01.184959026Z stdout F X-Frame-Options: SAMEORIGIN
2021-05-17T17:35:01.184967226Z stdout F X-Xss-Protection: 1; mode=block
2021-05-17T17:35:01.184976127Z stdout F Transfer-Encoding: chunked
2021-05-17T17:35:01.184675702Z stdout F [DEBUG] THIS IS A NEW SIMPLE LINE
And my config is:
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/appname*app*.log
DB /var/log/kube_app.db
Mem_Buf_Limit 200MB
Skip_Long_Lines Off
Refresh_Interval 10
multiline.parser multiline_app
[FILTER]
Name kubernetes
Match kube.*app*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
Merge_Parser appname_app
K8S-Logging.Parser On
K8S-Logging.Exclude Off
[MULTILINE_PARSER]
Name multiline_app
Type regex
Flush_Timeout 5000
#
# Regex rules for multiline parsing
# ---------------------------------
#
# configuration hints:
#
# - first state always has the name: start_state
# - every field in the rule must be inside double quotes
#
# rules | state name | regex pattern | next state
# ------|---------------|----------------------------------------------------------------------------------------------
rule "start_state" "^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>\[[^\]]+\].*)$" "cont"
rule "cont" "^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<log>.*)$" "cont"
[PARSER]
Name appname_app
Format regex
Regex \[(?<level>[^\]]+)\] (?<logger>[^ ]+) in (?:(?<thread_name>.+?)(?:-(?<thread_id>[\d-]+))?) (?:(?<marker_name>[^= ]+)=(?<marker_id>[^ ]+))? ?- (?<message>(.|\n|\r)*)
As you can see, the multiline parser is very simple, if the line has not format 2021-05-17T17:35:01.184675702Z stdout F [LEVEL] XXXXXXX is a continuation of multiline, but this don't work. Simply picks every line as new event.
Your Environment
- Version used: 1.8.2
I gave it a try and indeed found a problem as you reported. The problem appears on my case with filter multiline. Simple message concatenation on tail works fine (I've adjusted the multiline rules):
[0] tail.0: [0.000000000, {"log"=>"[DEBUG] THIS IS A SIMPLE LINE (THE NEXT IS THE MULTILINE EVENT)
"}]
[1] tail.0: [1626879674.165296408, {"log"=>"[DEBUG] p.s.a.o.a.n.h.HttpHandler in AsyncHttpClient-2-2 -
Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: EmptyByteBufBE)
GET /v2/auth/user HTTP/1.1
Authorization: Bearer AUTH
X-Correlation-ID: 5090
Host: example.com
Accept: */*
User-Agent: AHC/2.0
Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Content-Security-Policy: default-src https:
Content-Type: application/json; charset=utf-8
Date: Mon, 17 May 2021 17:35:01 GMT
Etag: W/"TAG/hs"
Referrer-Policy: no-referrer
Server: 4P
Strict-Transport-Security: max-age=315360000; includeSubDomains; preload
Vary: Origin
X-Content-Type-Options: nosniff
X-Correlator: fd945034-627e-4c2e-b8d6-9368fd71feb7
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
Transfer-Encoding: chunked
"}]
[2] tail.0: [1626879674.165296408, {"log"=>"[DEBUG] THIS IS A NEW SIMPLE LINE
"}]
I will keep troubleshooting...
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Bump!
I have exactly the same problem!
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Bump!
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Bump!
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
Could you please provide us with an update of the status of this issue using fluent-bit 2.1? From what I see in this thread it should be fixed but the bumps suggest otherwise.
The bug is the same as described in the first comment, only now when testing what happens is that instead of extracting each line as a new event, it takes the whole block as a new event (it seems to ignore the multiline filter).
Tested with fluentbit version=2.1.4, commit=4854f38c7c
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
Bump!
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.