fluent-bit
Multiline feature with built-in CRI parser does not separate streams
Bug Report
Describe the bug
When a container application produces log messages that the container runtime splits into multiple parts, and messages are written to stdout and stderr at roughly the same time, Fluent Bit may fail to concatenate the parts correctly.
To Reproduce
- Example log message:
2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A
2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one
2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B
2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one
2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two
2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one
2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream
2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D
- Steps to reproduce the problem:
Create an application that writes a couple of long lines to stdout and stderr, and run it under a CRI-O runtime (e.g. an OpenShift Kubernetes cluster).
Expected behavior
The output (in our case Elasticsearch) should show each log message complete, as a separate document.
Your Environment
- Version used: container image fluent/fluent-bit:1.8.9
- Configuration: relevant part:
[INPUT]
    Name              tail
    Path              /var/log/containers/*.log
    Multiline.Parser  cri
- Environment name and version: Kubernetes 1.20 / Openshift 4.7.37
- Server type and version:
- Operating System and version: RHEL 47.84, CRI-O 1.20.5-7
- Filters and plugins:
Additional context
- Lines have an indicator in field 3: F marks a single-line message or the final line of a multi-line message; P marks any part of a multi-line message other than the final one.
- Log messages from different streams (stdout, stderr) can be interleaved (see messages C and D above).
- Log messages can be JSON, and we also apply the JSON parser as a filter.
- Messages that are not recombined by Fluent Bit are hard to find in Elasticsearch, and when they are formatted as JSON, the individual parts obviously cannot be parsed as JSON, so we lose a lot of information for those lines.
Note that this is not about the multiline filter (see e.g. #4309), nor about the old multiline parser (see Tail - Multiline).
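The F/P semantics above imply the expected reassembly rule: buffer P-parts per stream, and flush a stream's buffer when an F line arrives. A minimal Python sketch of that rule (not Fluent Bit code; function and variable names are mine) applied to the sample lines:

```python
# Sketch: stream-aware reassembly of CRI log lines.
# Each CRI line is "<timestamp> <stream> <P|F> <message>"; P marks a partial
# line, F marks a single-line message or the final part of a multi-line one.
import re

CRI_LINE = re.compile(r"^(\S+) (stdout|stderr) ([PF]) (.*)$")

def reassemble(lines):
    """Yield (stream, message) pairs, concatenating P-parts per stream."""
    buffers = {"stdout": [], "stderr": []}
    for line in lines:
        ts, stream, flag, msg = CRI_LINE.match(line).groups()
        buffers[stream].append(msg)
        if flag == "F":  # message complete: flush this stream's buffer
            yield stream, "\n".join(buffers[stream])
            buffers[stream] = []

lines = [
    "2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A",
    "2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one",
    "2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B",
    "2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one",
    "2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two",
    "2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one",
    "2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream",
    "2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D",
]

messages = list(reassemble(lines))
for stream, msg in messages:
    print(stream, repr(msg))
```

Keying the buffers by stream is exactly what the built-in cri multiline parser appears not to do: message C (stdout) survives the interleaved stderr line D and is emitted whole.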
@edsiper IMO, this is a pretty major miss in the cri parser.
@edsiper @PettitWesley any idea when a fix can be implemented?
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
I take it that the "pretty major miss" is still open?
This definitely needs to be fixed @agup006 @edsiper; this is a fundamental thing users expect Fluent Bit to just get right.
I thought we had added partial message support in the initial version of the cri parser. Adding @lecaros: would you be able to help us schedule this with the other multiline work?
@agup006 @lecaros partial message support for CRI log files exists, but the bug reported here is that lines from stdout and stderr are not properly separated.
I want to note that fixing this is separate from the work that I did in the multiline filter to add buffered mode and partial_message support.
The work here is net new and different, though the use case/problem statement is similar: split messages from a container runtime need to be rejoined intelligently based on context (such as which stream they come from). My earlier work served folks who mainly use the fluentd Docker log driver to send logs to Fluent Bit; CRI container log files are a similar but distinct case.
This can be easily accomplished with a Lua filter:
-- Per-stream buffers so stdout and stderr parts are never mixed
local buffer = {
    stdout = { text = {} },
    stderr = { text = {} }
}

function cb_filter(tag, ts, record)
    local _, _, date, stream, record_type, line =
        record.log:find('^(%S+)%s+(%S+)%s+([PF])%s+(.+)$')
    if stream == nil then
        -- line does not match the CRI format; pass the record through unchanged
        return 0
    end
    table.insert(buffer[stream].text, line)
    if #(buffer[stream].text) == 1 then
        -- store the date extracted from the first line
        buffer[stream].date = date
    end
    if record_type == 'P' then
        -- partial line: keep buffering, drop this record for now
        return -1
    end
    -- 'F': flush the buffered message for this stream
    record = {
        date = buffer[stream].date,
        stream = stream,
        log = table.concat(buffer[stream].text, '\n')
    }
    buffer[stream].text = {}
    return 1, ts, record
end
Tested on the fluent-bit Lua playground. Sample input:
{"log":"2021-11-18T14:47:34.730832241+00:00 stdout F some short log message A"}
{"log":"2021-11-18T14:47:34.730832241+00:00 stdout P some longer log message B part one"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout F end of the longer message B"}
{"log":"2021-11-18T14:47:34.730908438+00:00 stdout P another longer log message C part one"}
{"log":"2021-11-18T14:47:34.730939216+00:00 stdout P message C continued in part two"}
{"log":"2021-11-18T14:47:34.735454549+00:00 stderr F some log message D in stderr part one"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stdout F message C still continued after a line from a different stream"}
{"log":"2021-11-18T14:47:34.735482854+00:00 stderr F and the end of the stderr message D"}
Output:
[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some short log message A"}]
[0,{"date":"2021-11-18T14:47:34.730832241+00:00","stream":"stdout","log":"some longer log message B part one\nend of the longer message B"}]
[0,{"date":"2021-11-18T14:47:34.735454549+00:00","stream":"stderr","log":"some log message D in stderr part one"}]
[0,{"date":"2021-11-18T14:47:34.730908438+00:00","stream":"stdout","log":"another longer log message C part one\nmessage C continued in part two\nmessage C still continued after a line from a different stream"}]
[0,{"date":"2021-11-18T14:47:34.735482854+00:00","stream":"stderr","log":"and the end of the stderr message D"}]
Note that this example does not consider multiple files. If streams must be grouped per container, the buffers would need to be keyed per file/tag.
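For reference, wiring a script like this into the pipeline from the environment section might look roughly as follows. This is a hedged sketch, not a tested configuration; the script filename `cri_reassemble.lua` is an assumption, and `cb_filter` must match the function name defined in the script:

```ini
[INPUT]
    Name     tail
    Path     /var/log/containers/*.log

[FILTER]
    Name     lua
    Match    *
    script   cri_reassemble.lua
    call     cb_filter
```

Note that the tail input here deliberately omits Multiline.Parser cri, since the Lua filter performs the parsing and reassembly itself.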