fluent-plugin-concat

Concat containerd/docker output in the same config

Open sonnyhcl opened this issue 4 years ago • 3 comments

Problem

Since Kubernetes is deprecating the Docker log driver in favor of containerd, we need to concatenate both containerd and Docker logs in the same config so that Kubernetes version upgrades are seamless. I know the README has examples for concatenating Docker and containerd logs separately, but when I use both, the log output is empty.

Steps to replicate

Provide example config and message

fluentd.conf

# This file collects and filters all Kubernetes container logs. Should rarely need to modify it.

# Do not directly collect fluentd's own logs to avoid infinite loops.
<match fluent.**>
  @type null
</match>

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  refresh_interval 2
  rotate_wait 5
  <parse>
     @type multi_format
     <pattern>
       format regexp
       expression /^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[FP]) (?<log>.*)$/
       time_format %Y-%m-%dT%H:%M:%S.%NZ
       keep_time_key true
     </pattern>
     <pattern>
       format json
       time_key @timestamp
       time_format %Y-%m-%dT%H:%M:%S.%NZ
       keep_time_key true
     </pattern>
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
  watch false
</filter>

# Exclude events from Geneva containers since they just seem to echo events from other containers
<filter kubernetes.var.log.containers.geneva**.log>
  @type grep
  <exclude>
    key log
    pattern .*
  </exclude>
</filter>

# Concat containerd partial log
# https://github.com/fluent/fluentd-kubernetes-daemonset/issues/412#issuecomment-636536767
<filter **>
  @id containerd_concat
  @type concat
  key log
  use_first_timestamp true
  partial_key logtag
  partial_value P
  separator ""
</filter>

# Concat log truncated by docker 16KB limit
<filter **>
  @id filter_concat
  @type concat
  key log
  use_first_timestamp true
  multiline_end_regexp /\n$/
  separator ""
</filter>

# Flatten fields nested within the 'log' field
<filter kubernetes.var.log.containers.**.log>
  @type parser
  format json
  key_name log
  reserve_data true
</filter>

# Flatten fields nested within the 'kubernetes' field and remove unnecessary fields
<filter kubernetes.var.log.containers.**.log>
  @type record_transformer
  enable_ruby
  <record>
    ContainerName ${record["kubernetes"]["container_name"]}
    NamespaceName ${record["kubernetes"]["namespace_name"]}
    PodName ${record["kubernetes"]["pod_name"]}
    Node ${record["kubernetes"]["host"]}
  </record>
  remove_keys docker,kubernetes,stream,log
</filter>

# Anything else goes to standard output
<match **>
  @type stdout
</match>

logger.yaml

kind: Deployment
apiVersion: apps/v1
metadata:
  name: logger
  labels:
    app: logger
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logger
  template:
    metadata:
      labels:
        app: logger
    spec:
      containers:
        - name: logger
          image: ubuntu
          command:
            - /bin/sh
          args:
            - '-c'
            - while true; do echo {\"EventName\":\"EventA\",\"Msg\":\"$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)\"}; echo {\"EventName\":\"EventB\",\"Msg\":\"$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 100000 | head -n 1)\"}; sleep 5; done

Expected Behavior

Logs should be parsed and concatenated seamlessly for both the containerd and Docker log formats.

Your environment

  • OS version
  • paste result of fluentd --version or td-agent --version
  • plugin version
    • paste boot log of fluentd or td-agent
    • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
root@aks-agentpool-12801864-vmss000000:/# uname -a
Linux aks-agentpool-12801864-vmss000000 5.4.0-1035-azure #36~18.04.1-Ubuntu SMP Wed Dec 16 23:49:28 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
root@aks-agentpool-12801864-vmss000000:/# td-agent --version
td-agent 4.1.0 fluentd 1.12.1 (e3effa337593618cbd7f0f4ef071766df1ec69a0)
root@aks-agentpool-12801864-vmss000000:/# td-agent-gem list

*** LOCAL GEMS ***

addressable (2.7.0)
async (1.28.3)
async-http (0.54.1)
async-io (1.30.1)
async-pool (0.3.3)
aws-eventstream (1.1.0)
aws-partitions (1.427.0)        
aws-sdk-core (3.112.0)
aws-sdk-kms (1.42.0)
aws-sdk-s3 (1.88.1)
aws-sdk-sqs (1.36.0)
aws-sigv4 (1.2.2)
benchmark (default: 0.1.0)      
bigdecimal (default: 2.0.0)     
bundler (2.2.11, default: 2.1.4)
cgi (default: 0.1.0)
concurrent-ruby (1.1.8, 1.1.7)  
console (1.10.1)
cool.io (1.7.1)
csv (default: 3.1.2)
date (default: 3.0.0)
delegate (default: 0.1.0)       
did_you_mean (default: 1.4.0)   
digest-crc (0.6.3)
domain_name (0.5.20190701)      
elasticsearch (7.10.0)
elasticsearch-api (7.10.0)      
elasticsearch-transport (7.10.0)
etc (default: 1.1.0)
excon (0.78.1)
faraday (1.3.0)
faraday-net_http (1.0.0)
fcntl (default: 1.0.0)
ffi (1.14.2)
ffi-compiler (1.0.1)
fiber-local (1.0.0)
fiddle (default: 1.0.0)
fileutils (1.5.0, default: 1.4.1)
fluent-config-regexp-type (1.0.0)
fluent-diagtool (1.0.1)
fluent-logger (0.9.0)
fluent-plugin-azureeventhubs (0.0.7)       
fluent-plugin-colomanager-heartbeat (0.1.0)
fluent-plugin-concat (2.4.0)
fluent-plugin-elasticsearch (4.3.3)        
fluent-plugin-flatten-hash (0.5.1)
fluent-plugin-flowcounter-simple (0.1.0)
fluent-plugin-hanarp-message (0.1.0)
fluent-plugin-json-transform (0.0.1)
fluent-plugin-kafka (0.16.0)
fluent-plugin-kubernetes_metadata_filter (2.5.3)
fluent-plugin-mdm (0.1.0)
fluent-plugin-mdsd (0.1.9.pre.build.dev)
fluent-plugin-multi-format-parser (1.0.0)
fluent-plugin-process-redfishalert (0.1.0)
fluent-plugin-process-snmptrap (0.1.0)
fluent-plugin-process-ucs-syslog (0.1.0)
fluent-plugin-prometheus (1.8.5)
fluent-plugin-prometheus_pushgateway (0.0.2)
fluent-plugin-record-modifier (2.1.0)
fluent-plugin-rewrite-tag-filter (2.3.0)
fluent-plugin-route (1.0.0)
fluent-plugin-s3 (1.5.1)
fluent-plugin-sd-dns (0.1.0)
fluent-plugin-servicebus-queue (0.1.0)
fluent-plugin-snmptrapalert (0.1.0)
fluent-plugin-systemd (1.0.2, 0.3.1)
fluent-plugin-td (1.1.0)
fluent-plugin-throttle (0.0.3)
fluent-plugin-webhdfs (1.4.0)
fluentd (1.12.1, 1.11.5, 0.12.43)
forwardable (default: 1.3.1)
getoptlong (default: 0.1.0)
hirb (0.7.3)
http (4.4.1)
http-accept (1.7.0)
http-cookie (1.0.3)
http-form_data (2.3.0)
http-parser (1.2.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.5.6)
ipaddr (default: 1.2.2)
irb (default: 1.2.6)
jmespath (1.4.0)
json (2.5.1, default: 2.3.0)
jsonpath (1.1.0)
kubeclient (4.9.1)
logger (default: 1.4.2)
lru_redux (1.1.0)
ltsv (0.1.2)
matrix (default: 0.2.0)
mime-types (3.3.1)
mime-types-data (3.2021.0212)
mini_portile2 (2.5.0)
minitest (5.13.0)
msgpack (1.4.2)
multi_json (1.15.0)
multipart-post (2.1.1)
prometheus-client (0.9.0)
protocol-hpack (1.4.2)
protocol-http (0.21.0)
protocol-http1 (0.13.2)
protocol-http2 (0.14.2)
pstore (default: 0.1.0)
psych (default: 3.1.0)
public_suffix (4.0.6)
quantile (0.2.1)
racc (1.5.2, default: 1.4.16)
rake (13.0.3, 13.0.1)
rdkafka (0.8.1)
rdoc (default: 6.2.1)
readline (default: 0.0.2)
recursive-open-struct (1.1.3)
reline (default: 0.1.5)
rest-client (2.1.0)
rexml (default: 3.2.3)
rss (default: 0.2.8)
ruby-kafka (1.3.0)
ruby-progressbar (1.11.0)
ruby2_keywords (0.0.2)
rubyzip (1.3.0)
sdbm (default: 1.0.0)
serverengine (2.2.3)
sigdump (0.2.4)
singleton (default: 0.1.0)
snmp (1.2.0)
string-scrub (0.0.5)
stringio (default: 0.1.0)
strptime (0.2.5)
strscan (default: 1.0.3)
systemd-journal (1.3.3)
td (0.16.9)
td-client (1.0.7)
td-logger (0.3.27)
test-unit (3.3.4)
timeout (default: 0.1.0)
timers (4.3.2)
tracer (default: 0.1.0)
tzinfo (2.0.4)
tzinfo-data (1.2021.1)
unf (0.1.4)
unf_ext (0.0.7.7)
uri (default: 0.10.0)
webhdfs (0.9.0)
webrick (1.7.0, default: 1.6.0)
xmlrpc (0.3.0)
yajl-ruby (1.4.1)
yaml (default: 0.1.0)
zip-zip (0.3)
zlib (default: 1.1.0)

sonnyhcl avatar Mar 02 '21 05:03 sonnyhcl

I guess that <filter **> causes this result, because ** applies both filters to every record. It may be better to use a separate exact match pattern for the Docker log driver and for containerd.

<filter **>
  @id containerd_concat
  @type concat
...
</filter>

<filter **>
  @id filter_concat
  @type concat
...
</filter>

kenhys avatar Jun 01 '21 04:06 kenhys

@kenhys In my case, containerd_concat and filter_concat capture logs from the same workload group, but in Kubernetes clusters of different versions. So I can't tell them apart with an exact match on the tag.

sonnyhcl avatar Jun 01 '21 05:06 sonnyhcl

Then how about using the rewrite_tag_filter plugin? The two formats can be distinguished by the timestamp and time keys.

<match sample>
  @type rewrite_tag_filter
  <rule>
    key timestamp
    pattern ...
    tag docker.${tag}
  </rule>
  <rule>
    key time
    pattern ...
    tag containerd.${tag}
  </rule>
</match>
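
A fuller sketch of the same idea, applied to the config from the report. This has not been confirmed in this thread and rests on several assumptions: the containerd.* and docker.* tag prefixes are illustrative, the logtag field is assumed to be present only on records parsed by the containerd regexp pattern, and every other record is assumed to be Docker JSON output. The <match> block would have to sit after the kubernetes_metadata filter and replace the two <filter **> concat blocks. Later filters that match kubernetes.var.log.containers.**.log would also need their patterns adjusted to the new tag prefixes.

# Hypothetical sketch: retag by record shape, then concat each format on its own tag.
<match kubernetes.**>
  @type rewrite_tag_filter
  # containerd records carry the logtag field (F or P) captured by the regexp parser
  <rule>
    key logtag
    pattern /^[FP]$/
    tag containerd.${tag}
  </rule>
  # Assumption: anything without a valid logtag is Docker JSON output
  <rule>
    key logtag
    pattern /^[FP]$/
    invert true
    tag docker.${tag}
  </rule>
</match>

# Concat containerd partial logs only
<filter containerd.**>
  @id containerd_concat
  @type concat
  key log
  use_first_timestamp true
  partial_key logtag
  partial_value P
  separator ""
</filter>

# Concat logs truncated by the Docker 16KB limit only
<filter docker.**>
  @id filter_concat
  @type concat
  key log
  use_first_timestamp true
  multiline_end_regexp /\n$/
  separator ""
</filter>

Routing on a field of the record rather than on the original tag means the same config could run on both cluster versions, because each concat filter would only see the format it was written for.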

kenhys avatar Jun 18 '21 05:06 kenhys