fluentd-docker-image

After fluentd image update, logs not pushing to Elasticsearch

Open koushickp opened this issue 4 years ago • 1 comment

After upgrading the fluentd image to the latest (fluent/fluentd:v1.14-debian), logs have stopped flowing to Elasticsearch.

Further, when we checked the fluentd logs, we observed that it is using the default configuration file rather than the ConfigMap that already exists on the cluster.

Any idea why it's not picking up the configuration from the ConfigMap?
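For reference, a ConfigMap like this is usually wired into the fluentd DaemonSet roughly as follows (a minimal sketch with illustrative names, not our exact manifest; the stock fluent/fluentd image reads /fluentd/etc/fluent.conf, or the file named by the FLUENTD_CONF env var, so the ConfigMap contents have to end up under /fluentd/etc to be picked up):

containers:
  - name: fluentd
    image: fluent/fluentd:v1.14-debian
    env:
      - name: FLUENTD_CONF        # file name the image entrypoint passes to fluentd -c
        value: fluent.conf
    volumeMounts:
      - name: config-volume       # hypothetical volume name
        mountPath: /fluentd/etc/config.d
volumes:
  - name: config-volume
    configMap:
      name: fluentd-es-config-v0.1.4

If the DaemonSet still mounts the map at the old path used by the fluentd-es addon image (often /etc/fluent/config.d), the new image never sees it and falls back to its built-in demo config, which matches the log output below.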

Logs

2021-10-26 04:17:52 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2021-10-26 04:17:52 +0000 [warn]: [output_docker1] 'time_format' specified without 'time_key', will be ignored
2021-10-26 04:17:52 +0000 [warn]: [output1] 'time_format' specified without 'time_key', will be ignored
2021-10-26 04:17:52 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type forward
    @id input1
    @label @mainstream
    port 24224
  </source>
  <filter **>
    @type stdout
  </filter>
  <label @mainstream>
    <match docker.**>
      @type file
      @id output_docker1
      path "/fluentd/log/docker.*.log"
      symlink_path "/fluentd/log/docker.log"
      append true
      time_slice_format %Y%m%d
      time_slice_wait 1m
      time_format %Y%m%dT%H%M%S%z
      <buffer time>
        timekey_wait 1m
        timekey 86400
        path /fluentd/log/docker.*.log
      </buffer>
      <inject>
        time_format %Y%m%dT%H%M%S%z
      </inject>
    </match>
    <match **>
      @type file
      @id output1
      path "/fluentd/log/data.*.log"
      symlink_path "/fluentd/log/data.log"
      append true
      time_slice_format %Y%m%d
      time_slice_wait 10m
      time_format %Y%m%dT%H%M%S%z
      <buffer time>
        timekey_wait 10m
        timekey 86400
        path /fluentd/log/data.*.log
      </buffer>
      <inject>
        time_format %Y%m%dT%H%M%S%z
      </inject>
    </match>
  </label>
</ROOT>
2021-10-26 04:17:52 +0000 [info]: starting fluentd-1.3.2 pid=8 ruby="2.5.2"
2021-10-26 04:17:52 +0000 [info]: spawn command to main:  cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
2021-10-26 04:17:53 +0000 [info]: gem 'fluentd' version '1.3.2'
2021-10-26 04:17:53 +0000 [info]: adding match in @mainstream pattern="docker.**" type="file"
2021-10-26 04:17:53 +0000 [warn]: #0 [output_docker1] 'time_format' specified without 'time_key', will be ignored
2021-10-26 04:17:53 +0000 [info]: adding match in @mainstream pattern="**" type="file"
2021-10-26 04:17:53 +0000 [warn]: #0 [output1] 'time_format' specified without 'time_key', will be ignored
2021-10-26 04:17:53 +0000 [info]: adding filter pattern="**" type="stdout"
2021-10-26 04:17:53 +0000 [info]: adding source type="forward"
2021-10-26 04:17:53 +0000 [info]: #0 starting fluentd worker pid=18 ppid=8 worker=0
2021-10-26 04:17:53 +0000 [info]: #0 [input1] listening port port=24224 bind="0.0.0.0"
2021-10-26 04:17:53 +0000 [info]: #0 fluentd worker is now running worker=0
2021-10-26 04:17:53.370950091 +0000 fluent.info: {"worker":0,"message":"fluentd worker is now running worker=0"}
2021-10-26 04:17:53 +0000 [warn]: #0 no patterns matched tag="fluent.info"

The fluentd ConfigMap that was already present on the cluster:

kubectl describe cm fluentd-es-config-v0.1.4 -n kube-system
Name:         fluentd-es-config-v0.1.4
Namespace:    kube-system
Labels:       addonmanager.kubernetes.io/mode=Reconcile
              argocd.argoproj.io/instance=common-dev
Annotations:  <none>

Data

forward.input.conf:

# Takes the messages sent over TCP
<source>
  @type forward
</source>
monitoring.conf:

# Prometheus Exporter Plugin
# input plugin that exports metrics
<source>
  @type prometheus
</source>

<source>
  @type monitor_agent
</source>

# input plugin that collects metrics from MonitorAgent
<source>
  @type prometheus_monitor
  <labels>
    host ${hostname}
  </labels>
</source>

# input plugin that collects metrics for output plugin
<source>
  @type prometheus_output_monitor
  <labels>
    host ${hostname}
  </labels>
</source>

# input plugin that collects metrics for in_tail plugin
<source>
  @type prometheus_tail_monitor
  <labels>
    host ${hostname}
  </labels>
</source>

output.conf:

<filter kubernetes.**>
  @type concat
  key log
  separator ""
  stream_identity_key tag
  multiline_start_regexp /^time=/
  flush_interval 5
  timeout_label @NORMAL
</filter>
<match **>
  @type relabel
  @label @NORMAL
</match>
<label @NORMAL>
    <match kubernetes.**>
      @type rewrite_tag_filter
      <rule>
        key log
        pattern (tag=(AUDIT_LOG|CUSTOMER_AUDIT_LOG)|"log_type":"AUDIT_LOG")
        tag auditlog.${tag}
      </rule>
      <rule>
        key log
        pattern tag=PARTNER_AUDIT_LOG
        tag partner-auditlog.${tag}
      </rule>
      <rule>
        key log
        pattern tag=MAINTENANCE_AUDIT_LOG
        tag maintenance-auditlog.${tag}
      </rule>
      <rule>
        key log
        pattern ^time=".*?".*
        tag daas_service.${tag}
      </rule>
      <rule>
        key log
        pattern ^time=".*?".*
        tag other_service.${tag}
        invert true
      </rule>
    </match>
    # Enriches records with Kubernetes metadata
    <filter {daas_service,other_service}.kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <filter {daas_service,other_service}.kubernetes.**>
      @type throttle
      group_key kubernetes.pod_name
      group_bucket_period_s   60
      group_bucket_limit      6000
    </filter>
    <filter daas_service.kubernetes.**>
      @type kvp
      parse_key log
      fields_key log_field
      pattern "([a-zA-Z_-]\\w*)=((['\"])(?:^(?:\\3)|[^\\\\]|\\\\.)*?(\\3)|[\\w.@$%/+-]*)"
    </filter>
    <filter daas_service.kubernetes.**>
      @type record_modifier
      <record>
        dummy ${if record.has_key?('log_field') and record['log_field'].has_key?('time'); record['@timestamp']=record['log_field']['time']; record['log_field'].delete('time'); end; nil}
        dummy2 ${begin; t = Time.parse record['@timestamp']; record['@timestamp'] = t.utc.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'); rescue; record.delete('@timestamp'); end; nil}
      </record>
      remove_keys dummy,dummy2
    </filter>
    <filter auditlog.kubernetes.**>
      @type kvp
      parse_key log
      pattern "([a-zA-Z_-]\\w*)=((['\"])(?:^(?:\\3)|[^\\\\]|\\\\.)*?(\\3)|[\\w.@$%/+-]*)"
    </filter>
    <filter auditlog.kubernetes.**>
      @type record_modifier
      <record>
        dummy ${if record.has_key?('time'); record['@timestamp']=record['time']; record.delete('time'); end; nil}
        dummy2 ${begin; t = Time.parse record['@timestamp']; record['@timestamp'] = t.utc.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'); rescue; record.delete('@timestamp'); end; nil}
        levelinfo ${if record.has_key?('level'); record['level']='info'; end; nil}
      </record>
      remove_keys dummy,dummy2,levelinfo
    </filter>
    <filter partner-auditlog.kubernetes.**>
      @type kvp
      parse_key log
      pattern "([a-zA-Z_-]\\w*)=((['\"])(?:^(?:\\3)|[^\\\\]|\\\\.)*?(\\3)|[\\w.@$%/+-]*)"
    </filter>
    <filter partner-auditlog.kubernetes.**>
      @type record_modifier
      <record>
        dummy ${if record.has_key?('time'); record['@timestamp']=record['time']; record.delete('time'); end; nil}
        dummy2 ${begin; t = Time.parse record['@timestamp']; record['@timestamp'] = t.utc.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'); rescue; record.delete('@timestamp'); end; nil}
        levelinfo ${if record.has_key?('level'); record['level']='info'; end; nil}
      </record>
      remove_keys dummy,dummy2,levelinfo
    </filter>
    <filter maintenance-auditlog.kubernetes.**>
      @type kvp
      parse_key log
      pattern "([a-zA-Z_-]\\w*)=((['\"])(?:^(?:\\3)|[^\\\\]|\\\\.)*?(\\3)|[\\w.@$%/+-]*)"
    </filter>
    <filter maintenance-auditlog.kubernetes.**>
      @type record_modifier
      <record>
        dummy ${if record.has_key?('time'); record['@timestamp']=record['time']; record.delete('time'); end; nil}
        dummy2 ${begin; t = Time.parse record['@timestamp']; record['@timestamp'] = t.utc.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'); rescue; record.delete('@timestamp'); end; nil}
        levelinfo ${if record.has_key?('level'); record['level']='info'; end; nil}
      </record>
      remove_keys dummy,dummy2,levelinfo
    </filter>
    <filter other_service.kubernetes.**_kube-system_**>
      @type grep
      <exclude>
        key log
        pattern /INFO/
      </exclude>
    </filter>
    <filter haproxy.**>
      @type parser
      key_name message
      reserve_data true
      reserve_time true
      emit_invalid_record_to_error false
      <parse>
        @type multi_format
        <pattern>
          format regexp
          # Examples
          # 10.2.1.0:31654 [06/Nov/2019:13:21:05.569] httpsfront default-paas-secure-443/10.20.48.136:443 1/0/642 3670 SD 3/2/0/0/0 0/0
          expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] httpsfront (?<namespace>[\w]+)-(?<service>[\w-]+)\/(?<backendAddress>[\w\.]+:\d+) (?<waitTime>\d+)\/(?<backendConnectTime>\d+)\/(?<responseTime>\d+) (?<responseBytes>\d+) (?<terminationState>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srvconn>\d+)\/(?<retries>\d+) (?<srvqueue>\d+)\/(?<backendQueue>\d+)$/
        </pattern>
        <pattern>
          format regexp
          expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] httpfront-(?<domain>[\w-.]+)~ (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)(-[\d]+)?\/[\w-]+ (?<requestReadTime>\d+)\/(?<waitTime>\d+)\/(?<backendConnectTime>\d+)\/(?<backendResponseTime>\d+)\/(?<responseTime>\d+) (?<statusCode>\d+) (?<responseBytes>\d+) (?<reqCookie>[\w-]+) (?<resCookie>[\w-]+) (?<terminationState>[\w-]+) (?<actconn>\d+)\/(?<feconn>\d+)\/(?<beconn>\d+)\/(?<srvconn>\d+)\/(?<retries>\d+) (?<srvqueue>\d+)\/(?<backendQueue>\d+) "(?<method>[A-Z]+) (?<url>[^ ]+) (?<httpVersion>[^ ]+)"$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Connect from 172.20.59.142:13201 to 172.20.59.142:31916 (httpfront/HTTP)
          # Connect from 10.0.1.2:33312 to 10.0.3.31:8012 (www/HTTP)
          expression /^Connect from (?<remoteAddress>[\w\.]+:\d+) to (?<backendAddress>[\w\.]+:\d+) \((?<frontend>[\w]+)\/(?<mode>[\w]+)\)$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0002 is going DOWN for maintenance. 3 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0001 is going DOWN for maintenance. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is going DOWN for maintenance. (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers left. (?<activeSessions>\d+) sessions active, (?<requeued>\d+) requeued, (?<remainingInQueue>\d+) remaining in queue.$/
        </pattern>


        <pattern>
          format regexp
          # Examples:
          # "10.2.2.0:60889 [06/Nov/2019:13:54:54.904] httpfront-shared-frontend/3: SSL handshake failure"
          expression /^(?<remoteAddress>[\w\.]+:\d+) \[(?<requestDate>[^\]]*)\] (?<frontend>[\w-]+\/\d+): (?<msg>[\w].*)$/
        </pattern>
        <pattern>
          format regexp
          # Examples:
          # Server kube-system-fluentd-http-http-input/server0003 is DOWN, reason: Layer4 connection problem, info: \"Connection refused\", check duration: 0ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
          # Server kube-system-fluentd-http-http-input/server0003 is UP, reason: Layer4 check passed, check duration: 0ms. 3 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
          expression /^Server (?<namespace>kube-system|[\w]+)-(?<service>[\w-]+)\/[\w-]+ is (?<status>[\w]+), reason: (?<reason>[^,]+), (info: "(?<info>[^"]+)", )?check duration: (?<checkDuration>[^.]+). (?<remainingActive>\d+) active and (?<remainingBackup>\d+) backup servers (left|online). ((?<activeSessions>\d+) sessions active, )?(?<requeued>\d+) (sessions )?requeued, (?<remainingInQueue>\d+) (remaining|total) in queue.$/
        </pattern>
      </parse>
    </filter>
    <match auditlog.kubernetes.**>
      @id elasticsearch-auditlog
      @type elasticsearch
      @log_level info
      include_tag_key true
      host xxxxxxxxxxxxxxxxxxxxxxxx
      port 443
      user "#{ENV['ELASTIC_USER']}"
      password "#{ENV['ELASTIC_PASSWORD']}"
      scheme "https"
      ssl_verify false
      type_name _doc
      ssl_version TLSv1_2
      time_precision 3
      logstash_format true
      logstash_prefix auditlog
      reconnect_on_error true
      request_timeout 30s
      bulk_message_request_threshold -1
      <buffer>
        @type file
        path /var/log/fluentd-buffers/auditlog.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size 10M
        total_limit_size 10G
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
    <match partner-auditlog.kubernetes.**>
      @id elasticsearch-partner-auditlog
      @type elasticsearch
      @log_level info
      include_tag_key true
      host xxxxxxxxxxxxx
      port 443
      user "#{ENV['ELASTIC_USER']}"
      password "#{ENV['ELASTIC_PASSWORD']}"
      scheme "https"
      ssl_verify false
      ssl_version TLSv1_2
      type_name _doc
      time_precision 3
      logstash_format true
      logstash_prefix partner-auditlog
      reconnect_on_error true
      request_timeout 30s
      bulk_message_request_threshold -1
      <buffer>
        @type file
        path /var/log/fluentd-buffers/partner-auditlog.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size 10M
        total_limit_size 10G
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
    <match maintenance-auditlog.kubernetes.**>
      @id elasticsearch-maintenance-auditlog
      @type elasticsearch
      @log_level info
      include_tag_key true
      user "#{ENV['ELASTIC_USER']}"
      password "#{ENV['ELASTIC_PASSWORD']}"
      scheme "https"
      ssl_verify false
      ssl_version TLSv1_2
      type_name _doc
      host elastic.audit.dev-mgmt.srv.da.nsn-rdnet.net
      port 443
      time_precision 3
      logstash_format true
      logstash_prefix maintenance-auditlog
      reconnect_on_error true
      request_timeout 30s
      bulk_message_request_threshold -1
      <buffer>
        @type file
        path /var/log/fluentd-buffers/maintenance-auditlog.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size 10M
        total_limit_size 1G
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
    <match haproxy.**>
      @id elasticsearch-haproxy
      @type elasticsearch
      @log_level info
      include_tag_key true
      user "#{ENV['ELASTIC_USER']}"
      password "#{ENV['ELASTIC_PASSWORD']}"
      scheme "https"
      type_name _doc
      ssl_verify false
      ssl_version TLSv1_2
      host da-eck-es-http.monitoring
      port 9200
      time_precision 3
      logstash_format true
      logstash_prefix haproxy
      reconnect_on_error true
      request_timeout 30s
      bulk_message_request_threshold -1
      <buffer>
        @type memory
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 20
        retry_max_times 10
        chunk_limit_size 10M
        total_limit_size 100M
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      type_name _doc
      host da-eck-es-http.monitoring
      port 9200
      user "#{ENV['ELASTIC_USER']}"
      password "#{ENV['ELASTIC_PASSWORD']}"
      scheme "https"
      ssl_verify false
      ssl_version TLSv1_2
      time_precision 3
      logstash_format true
      reconnect_on_error true
      request_timeout 30s
      bulk_message_request_threshold -1
      <buffer>
        @type memory
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_max_interval 30
        chunk_limit_size 10M
        retry_max_times 10
        total_limit_size 1G
        queued_chunks_limit_size 100
        overflow_action throw_exception
      </buffer>
    </match>
</label>
system.conf:
<system>
  root_dir /tmp/fluentd-buffers/
</system>
system.input.conf:

# /var/run/google.startup.script
<source>
  @id startupscript.log
  @type tail
  format syslog
  path /var/log/startupscript.log
  pos_file /var/log/es-startupscript.log.pos
  tag startupscript
</source>


<source>
  @id kubelet.log
  @type tail
  format multiline
  multiline_flush_interval 5s
  format_firstline /^\w\d{4}/
  format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
  time_format %m%d %H:%M:%S.%N
  path /var/log/kubelet.log
  pos_file /var/log/es-kubelet.log.pos
  tag kubelet
</source>

<source>
  @id journald-docker
  @type systemd
  matches [{ "_SYSTEMD_UNIT": "docker.service" }]
  <storage>
    @type local
    persistent true
  </storage>
  read_from_head true
  tag docker
</source>
<source>
 @id haproxy-ingress
 @type syslog
 port 5140
 bind 0.0.0.0
 tag haproxy
 <parse>
   @type syslog
   message_format rfc5424
   rfc5424_time_format %Y-%m-%dT%H:%M:%S%z
 </parse>
</source>

<source>
  @id journald-kubelet
  @type systemd
  matches [{ "_SYSTEMD_UNIT": "kubelet.service" }]
  <storage>
    @type local
    persistent true
  </storage>
  read_from_head true
  tag kubelet
</source>

containers.input.conf:

<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/*.log
  exclude_path /var/log/containers/fluentd*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
</source>

Events: <none>
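One thing that stands out: none of the keys in this ConfigMap is named fluent.conf, which is the only file the stock fluent/fluentd image loads by default. If the map is mounted at a config.d-style path, a top-level /fluentd/etc/fluent.conf shim has to pull the keys in explicitly. A hypothetical sketch, assuming the map is mounted at /fluentd/etc/config.d:

# Hypothetical /fluentd/etc/fluent.conf shim; include order matters because
# fluentd evaluates <match> blocks in the order they appear.
@include config.d/system.conf
@include config.d/containers.input.conf
@include config.d/system.input.conf
@include config.d/forward.input.conf
@include config.d/monitoring.conf
@include config.d/output.conf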

koushickp · Oct 26 '21

Did upgrading the Fluentd image actually succeed? It looks like an old version of Fluentd is still running:

2021-10-26 04:17:52 +0000 [info]: starting fluentd-1.3.2 pid=8 ruby="2.5.2"
2021-10-26 04:17:52 +0000 [info]: spawn command to main:  cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
2021-10-26 04:17:53 +0000 [info]: gem 'fluentd' version '1.3.2'
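
You can confirm which image and Fluentd version the pod is actually running, for example (the label selector and pod name here are placeholders; adjust them to your deployment):

# Show the image each fluentd pod was started with
kubectl get pods -n kube-system -l k8s-app=fluentd-es \
  -o jsonpath='{.items[*].spec.containers[*].image}'
# Ask the running container for its Fluentd version
kubectl exec -n kube-system <fluentd-pod> -- fluentd --version

If it still reports 1.3.2, the new image was never rolled out (for example, the DaemonSet spec was not updated, or the pods were not restarted).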

kenhys · Nov 01 '21