fluent-bit
Multiline feature with built-in CRI parser does not separate streams
Bug Report
Describe the bug
When a container application produces log messages that the container runtime splits into multiple parts, and messages are written to stdout and stderr at roughly the same time, Fluent Bit may fail to concatenate the parts correctly.
To Reproduce
- Example log message:
2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A
2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one
2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B
2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one
2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two
2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one
2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream
2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D
- Steps to reproduce the problem:
Create an application that writes a couple of long lines to stdout and stderr, and run it under a CRI-O runtime (e.g. an OpenShift Kubernetes cluster).
Expected behavior
The output (in our case Elasticsearch) should show each log message complete, as a separate document.
Your Environment
- Version used: container image fluent/fluent-bit:1.8.9
- Configuration: relevant part:
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Multiline.Parser  cri
- Environment name and version: Kubernetes 1.20 / Openshift 4.7.37
- Server type and version:
- Operating System and version: RHEL 47.84, CRI-O 1.20.5-7
- Filters and plugins:
Additional context
- Lines have an indicator in field 3: F marks a single-line message or the final line of a multi-line message; P marks any part of a multi-line message other than the final one.
- Log messages from different streams (stdout, stderr) can be interleaved (see messages C and D above).
- Log messages can be JSON, and we also apply the JSON parser as a filter.
- Messages that are not recombined by Fluent Bit are hard to find in Elasticsearch, and when they are formatted as JSON, the individual parts obviously cannot be parsed as JSON, so we lose a lot of information for those lines.
Note that this is not about the multiline filter (see e.g. #4309), nor about the old multiline parser (see Tail - Multiline).
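The F/P semantics above imply the expected reassembly rule: buffer P-parts per stream, and flush a stream's buffer when an F line arrives. A minimal Python sketch of that rule (not Fluent Bit code; function and variable names are mine) applied to the sample lines:

```python
# Sketch: stream-aware reassembly of CRI log lines.
# Each CRI line is "<timestamp> <stream> <P|F> <message>"; P marks a partial
# line, F marks a single-line message or the final part of a multi-line one.
import re

CRI_LINE = re.compile(r"^(\S+) (stdout|stderr) ([PF]) (.*)$")

def reassemble(lines):
    """Yield (stream, message) pairs, concatenating P-parts per stream."""
    buffers = {"stdout": [], "stderr": []}
    for line in lines:
        ts, stream, flag, msg = CRI_LINE.match(line).groups()
        buffers[stream].append(msg)
        if flag == "F":  # message complete: flush this stream's buffer
            yield stream, "\n".join(buffers[stream])
            buffers[stream] = []

lines = [
    "2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A",
    "2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one",
    "2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B",
    "2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one",
    "2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two",
    "2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one",
    "2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream",
    "2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D",
]

messages = list(reassemble(lines))
for stream, msg in messages:
    print(stream, repr(msg))
```

Keying the buffers by stream is exactly what the built-in cri multiline parser appears not to do: message C (stdout) survives the interleaved stderr line D and is emitted whole.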
@edsiper IMO, this is a pretty major miss in the cri parser.
@edsiper @PettitWesley any idea when a fix can be implemented?
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
I take it that the "pretty major miss" is still open?
This definitely needs to be fixed @agup006 @edsiper; this is a fundamental thing users expect Fluent Bit to just get right.
I thought we had added partial message support in the initial version of the cri parser. Adding @lecaros: would you be able to help us schedule this with the other multiline work?
@agup006 @lecaros partial message support for CRI log files exists, but the bug reported here is that lines from stdout and stderr are not properly separated.
I want to note that fixing this is separate from the work that I did in the multiline filter to add buffered mode and partial_message support.
The work here is net new and different, though the use case/problem statement is similar: split messages from a container runtime need to be rejoined intelligently based on context (such as which stream they come from). My earlier work served folks who mainly use the fluentd Docker log driver to send logs to Fluent Bit; CRI container log files are a similar but distinct case.
This can be easily accomplished with a Lua filter:
-- Per-stream buffers so stdout and stderr parts are never mixed
local buffer = {
    stdout = { text = {} },
    stderr = { text = {} }
}

function cb_filter(tag, ts, record)
    local _, _, date, stream, record_type, line =
        record.log:find('^(%S+)%s+(%S+)%s+([PF])%s+(.+)$')
    if stream == nil then
        -- line does not match the CRI format; pass the record through unchanged
        return 0
    end
    table.insert(buffer[stream].text, line)
    if #(buffer[stream].text) == 1 then
        -- store the date extracted from the first line
        buffer[stream].date = date
    end
    if record_type == 'P' then
        -- partial line: keep buffering, drop this record for now
        return -1
    end
    -- 'F': flush the buffered message for this stream
    record = {
        date = buffer[stream].date,
        stream = stream,
        log = table.concat(buffer[stream].text, '\n')
    }
    buffer[stream].text = {}
    return 1, ts, record
end
Tested on the fluent-bit Lua playground. Sample input:
{"log":"2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A"}
{"log":"2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one"}
{"log":"2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two"}
{"log":"2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D"}
Output:
[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some short log message A"}]
[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some longer log message B part one\nend of the longer message B"}]
[0,{"date":"2021-11-18T14:47:34.735454549+00:00","stream":"stderr","log":"some log message D in stderr part one"}]
[0,{"date":"2021-11-18T14:47:34.730908438+00:00","stream":"stdout","log":"another longer log message C part one\nmessage C continued in part two\nmessage C still continued after a line from a different stream"}]
[0,{"date":"2021-11-18T14:47:34.735482854+00:00","stream":"stderr","log":"and the end of the stderr message D"}]
Note that this example does not consider multiple files. If streams must be grouped per container, the buffers would need to be keyed per file/tag.
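For reference, wiring a script like this into the pipeline from the environment section might look roughly as follows. This is a hedged sketch, not a tested configuration; the script filename `cri_reassemble.lua` is an assumption, and `cb_filter` must match the function name defined in the script:

```ini
[INPUT]
    Name     tail
    Path     /var/log/containers/*.log

[FILTER]
    Name     lua
    Match    *
    script   cri_reassemble.lua
    call     cb_filter
```

Note that the tail input here deliberately omits Multiline.Parser cri, since the Lua filter performs the parsing and reassembly itself.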