
[unsupported v1.8] Kubernetes logs are not parsed in json with fluentbit 1.8.5

Open nandikatlakishore opened this issue 3 years ago • 14 comments

Bug Report

Hi,

Fluent Bit is not parsing the JSON log messages generated by Kubernetes pods; the log fields show up with escaped slashes. The log message is proper JSON when generated inside the pod, as shown below.

{ "@timestamp": "2022-07-05 08:12:13.4600", "log.level": "DEBUG", "log.logger": "Enrichment.Services.EnrichmentService", "message": "Get next claim for enrichment" }

When processed by the Fluent Bit parser with json format, the log is not parsed as JSON:

"log": "{\"log\":\"{ \\\"@timestamp\\\": \\\"2022-07-05 06:50:23.6642\\\", \\\"log.level\\\": \\\"DEBUG\\\", \\\"log.logger\\\": \\\"Enrichment.Services.EnrichmentService\\\", \\\"message\\\": \\\"Get next claim for enrichment\\\" }\\n\",\"stream\":\"stdout\",\"time\":\"2022-07-05T06:50:23.6644413Z\"}", "data": { "log": "{ \"@timestamp\": \"2022-07-05 06:50:23.6642\", \"log.level\": \"DEBUG\", \"log.logger\": \"Enrichment.Services.EnrichmentService\", \"message\": \"Get next claim for enrichment\" }", "stream": "stdout", "time": "2022-07-05T06:50:23.6644413Z" },
I have tested the configuration both with Decode_Field_As and without a decoder.
Below is the configuration from the values.yaml file:
[SERVICE]
    Flush          2
    Daemon         Off
    Log_Level      info
    Parsers_File   parsers.conf
    HTTP_Server    On
    HTTP_Listen    0.0.0.0
    HTTP_Port      2020

[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    Exclude_Path      ${FLUENT_ELASTICSEARCH_EXCLUDE}
    Parser            docker
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     256MB
    Skip_Long_Lines   On
    Refresh_Interval  10

[FILTER]
    Name           kubernetes
    Match          kube.*
    Kube_URL       https://kubernetes.default.svc:443
    Merge_Log           On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

[OUTPUT]
    Name   es
    Match  *
    Host   ${FLUENT_ELASTICSEARCH_HOST}
    Port   ${FLUENT_ELASTICSEARCH_PORT}
    Logstash_Format On
    Retry_Limit     False
    tls             On
    tls.verify      Off
    Replace_dots    On

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   Off

To Reproduce

  • Rubular link if applicable:
  • Example log message if applicable:
{"log":"YOUR LOG MESSAGE HERE","stream":"stdout","time":"2018-06-11T14:37:30.681701731Z"}


nandikatlakishore avatar Jul 05 '22 08:07 nandikatlakishore

are you able to reproduce the problem with Fluent Bit v1.9.x ? note that 1.8.x is no longer maintained.

edsiper avatar Jul 06 '22 02:07 edsiper

Hi @edsiper, I have not tested this with 1.9, but the problem is there in 1.8.5, tested across multiple Kubernetes environments. There is nothing in the fluent-bit logs to indicate why the JSON parsing is not happening properly. Per the documentation, JSON parsing was fixed long ago, but I am seeing this issue now. Can you tell me where to look for the error cause, or how to troubleshoot this further? I have looked everywhere for logs and did not find any useful info.

nandikatlakishore avatar Jul 06 '22 09:07 nandikatlakishore

Sorry, I cannot help you with v1.8.5; it's pretty old (released in August 2021). At the very least you should upgrade to v1.8.15 and check whether the issue persists.

edsiper avatar Jul 06 '22 14:07 edsiper

@edsiper - we will try our cluster config on a local instance and upgrade one of the clusters to .15 as you've suggested. Will be back shortly with results.

vinyar avatar Jul 09 '22 23:07 vinyar

@edsiper we have upgraded Fluent Bit to 1.8.15 and are still seeing the JSON parsing issue. Can you please help with this?

{
  "_index": "logstash-2022.07.20",
  "_type": "_doc",
  "_id": "PMPBGYIBud9fXG-8c3xh",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2022-07-20T03:58:03.348Z",
    "log": "{\"log\":\"{ \\\"@timestamp\\\": \\\"2022-07-20 03:58:03.3477\\\", \\\"log.level\\\": \\\"DEBUG\\\", \\\"log.logger\\\": \\\"Enrichment.Services.EnrichmentService\\\", \\\"message\\\": \\\"Get next claim for enrichment\\\" }\\n\",\"stream\":\"stdout\",\"time\":\"2022-07-20T03:58:03.347958567Z\"}",
    "data": {
      "log": "{ \"@timestamp\": \"2022-07-20 03:58:03.3477\", \"log.level\": \"DEBUG\", \"log.logger\": \"Enrichment.Services.EnrichmentService\", \"message\": \"Get next claim for enrichment\" }",
      "stream": "stdout",
      "time": "2022-07-20T03:58:03.347958567Z"
    },

nandikatlakishore avatar Jul 20 '22 10:07 nandikatlakishore

I have updated Fluent Bit in a different cluster to 1.9.5 and fed it JSON logs to be parsed. The result is still the same: the output log is invalid JSON.

Docker logs passed as input:

$ kubectl  logs python-gunicorn-webapp-86849477d6-2qhnb
{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-20T14:55:32.512988Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path \n", "logger": "myapp"}

Logs on kibana:

 "@timestamp": "2022-07-20T14:55:32.514Z",
    "log": "{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-20T14:55:32.512988Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n",
    "stream": "stdout",
    "time": "2022-07-20T14:55:32.514187635Z",

[screenshot: Kibana document view]

nandikatlakishore avatar Jul 20 '22 15:07 nandikatlakishore

@edsiper @patrick-stephens appreciate some help on this.

nandikatlakishore avatar Jul 26 '22 10:07 nandikatlakishore

@nandikatlakishore I think you need to look at what your underlying container runtime log looks like - not what the app is producing in the pod. This is what FB will be trying to parse and you may have some strange configuration there messing with it. I suspect it's quoting your JSON and wrapping it in additional JSON.

Look at the actual container log files you are tailing and not the output of the pod. Once you have this you should be able to use dummy input to then generate a fully reproducible example here.
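A minimal reproduction along those lines might look like the sketch below, assuming the standard dummy input and parser filter plugins; the sample message and parser name are placeholders, not the reporter's actual data:

```
[SERVICE]
    Parsers_File parsers.conf

[INPUT]
    # The dummy input emits the same record on every tick, simulating
    # one line of the container runtime log file
    Name   dummy
    Dummy  {"log":"{\"level\": \"INFO\", \"message\": \"hello\"}\n","stream":"stdout"}

[FILTER]
    Name     parser
    Parser   docker
    Key_name log
    Match    *

[OUTPUT]
    Name   stdout
    Match  *
```

Anyone can then run a standalone fluent-bit container against this, with no Kubernetes cluster involved.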

patrick-stephens avatar Jul 29 '22 12:07 patrick-stephens

@patrick-stephens, below is the log in the container runtime log file. The entries are escaped, but they are valid JSON, whereas the stdout log stream after being processed by FB ends up as invalid JSON. You can see the container log below.

[root@ip-xxx python-gunicorn-webapp]# tail -2 0.log
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:16.733977Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:18.506932Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:18.507187382Z"}
[root@ip-xxx python-gunicorn-webapp]# pwd
/var/log/pods/default_python-gunicorn-webapp-86849477d6-2qhnb_37510592-b3dd-4227-863a-113c734fecdd/python-gunicorn-webapp
[root@ip-10-0-102-99 python-gunicorn-webapp]#
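As an aside, the double escaping in the runtime log above is ordinary JSON string encoding: the wrapper is valid JSON, and its `log` field is itself a valid JSON document. A quick Python check on a shortened line of the same shape (hypothetical values, not the exact record above) confirms both layers decode cleanly:

```python
import json

# A shortened container-runtime log line in the same shape as the tail output above
line = '{"log":"{\\"level\\": \\"INFO\\", \\"message\\": \\"Request on /path \\\\n\\"}\\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}'

outer = json.loads(line)          # first pass: the runtime's JSON wrapper
inner = json.loads(outer["log"])  # second pass: the application's own JSON payload

print(inner["level"])             # INFO
print(repr(inner["message"]))     # the \n escape decodes to a real newline
```

So the escaping itself is not corruption; the question is only whether Fluent Bit performs that second decode pass.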

nandikatlakishore avatar Jul 29 '22 16:07 nandikatlakishore

I would try using the decoders to see if they will un-escape the field: https://docs.fluentbit.io/manual/pipeline/parsers/decoders

I would also use the stdout output plugin to see what the records look like to Fluent Bit, in case something downstream in Elastic/Kibana is altering the logs: https://docs.fluentbit.io/manual/pipeline/outputs/standard-output This will show you what Fluent Bit is actually doing to the logs.

patrick-stephens avatar Jul 29 '22 16:07 patrick-stephens

@patrick-stephens, I have used decoders as described in the documentation, with the same result. The stdout output also shows the same malformed JSON.

nandikatlakishore avatar Aug 08 '22 09:08 nandikatlakishore

Also stdout has the same output with malformed JSON

That's fine; my main point was to get it correct at this stage, otherwise it will always be wrong in Kibana. A reproducible example at this level is also orders of magnitude easier to iterate on and debug, since you don't need to run the complete stack, just a single container.

patrick-stephens avatar Aug 08 '22 09:08 patrick-stephens

@nandikatlakishore weird, the example log file you show works perfectly for me with the documented decoder setup; it seems to be producing valid JSON below, I think:

$ docker run --rm -it -v $PWD/:/fluent-bit/etc/:ro fluent/fluent-bit
Fluent Bit v1.9.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/08/08 09:30:28] [ info] [fluent bit] version=1.9.6, commit=b30141ba90, pid=1
[2022/08/08 09:30:28] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/08/08 09:30:28] [ info] [cmetrics] version=0.3.5
[2022/08/08 09:30:28] [ info] [sp] stream processor started
[2022/08/08 09:30:28] [ info] [output:stdout:stdout.0] worker #0 started
[2022/08/08 09:30:28] [ info] [input:tail:tail.0] inotify_fs_add(): inode=100401660 watch_fd=1 name=/fluent-bit/etc/test.log
[0] tail.0: [1659951028.109776966, {"log"=>"{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path 
", "logger": "myapp"}
", "stream"=>"stdout", "time"=>"2022-07-29T14:47:16.734687718Z"}]
[1] tail.0: [1659951028.109787886, {"log"=>"{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:18.506932Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path 
", "logger": "myapp"}
", "stream"=>"stdout", "time"=>"2022-07-29T14:47:18.507187382Z"}]

With config:

$ cat fluent-bit.conf 
[SERVICE]
  Parsers_file parser.conf

[INPUT]
  Name Tail
  Read_from_head on
  Path /fluent-bit/etc/test.log

[FILTER]
  Name parser
  Parser docker
  Key_name log
  Match *

[OUTPUT]
  Name stdout
  Match *

$ cat parser.conf 
[PARSER]
    Name         docker
    Format       json
    Decode_Field_As    escaped     log

$ cat test.log 
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:16.733977Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:18.506932Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:18.507187382Z"}

The JSON is therefore:

"log": {"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path 
", "logger": "myapp"}
"

The newline character is interpreted by the shell here, I think; in the actual JSON output the \n escape would be required, I believe.
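To illustrate that point with a quick sketch: once a JSON serializer re-encodes the decoded string, the \n escape comes back, so the literal newline in the stdout dump is purely a display artifact (the sample string here is hypothetical):

```python
import json

decoded = "Request on /path \n"   # real newline, as printed by the stdout plugin
print(json.dumps(decoded))        # re-encoding restores the \n escape
```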

patrick-stephens avatar Aug 08 '22 09:08 patrick-stephens

If I add a Lua filter to output the specific key then it looks OK; my only concern is the multiline output:

$ docker run --rm -it -v $PWD/:/fluent-bit/etc/:ro fluent/fluent-bit
Fluent Bit v1.9.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/08/08 09:40:33] [ info] [fluent bit] version=1.9.6, commit=b30141ba90, pid=1
[2022/08/08 09:40:33] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/08/08 09:40:33] [ info] [cmetrics] version=0.3.5
[2022/08/08 09:40:33] [ info] [sp] stream processor started
[2022/08/08 09:40:33] [ info] [output:stdout:stdout.0] worker #0 started
lua:	{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path 
", "logger": "myapp"}

lua:	{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:18.506932Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path 
", "logger": "myapp"}

fluent-bit.conf:

[SERVICE]
  Parsers_file parser.conf

[INPUT]
  Name Tail
  Read_from_head on
  Path /fluent-bit/etc/test.log

[FILTER]
  Name parser
  Parser docker
  Key_name log
  Match *
  
[FILTER]
  Name lua
  Match *
  script test.lua
  call cb_print

[OUTPUT]
  Name stdout
  Match *

test.lua:

function cb_print(tag, timestamp, record)
    print("lua:", record['log'])
    return 0, timestamp, record
end

patrick-stephens avatar Aug 08 '22 09:08 patrick-stephens

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] avatar Nov 07 '22 02:11 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Nov 13 '22 02:11 github-actions[bot]