[unsupported v1.8] Kubernetes logs are not parsed as JSON with Fluent Bit 1.8.5
Bug Report
Hi,
Fluent Bit is not parsing the JSON log messages generated in Kubernetes pods; the log fields show up with escaped slashes. The log message is valid JSON when generated inside the pod, as shown below.
{ "@timestamp": "2022-07-05 08:12:13.4600", "log.level": "DEBUG", "log.logger": "Enrichment.Services.EnrichmentService", "message": "Get next claim for enrichment" }
When parsed by the Fluent Bit parser with json format, the log is not parsed as JSON:
"log": "{\"log\":\"{ [\\\"@timestamp\\\](file:///%22@timestamp/)": [\\\"2022-07-05](file:///%222022-07-05) 06:50:23.6642\\\", [\\\"log.level\\\](file:///%22log.level/)": [\\\"DEBUG\\\](file:///%22DEBUG/)", [\\\"log.logger\\\](file:///%22log.logger/)": [\\\"Enrichment.Services.EnrichmentService\\\](file:///%22Enrichment.Services.EnrichmentService/)", [\\\"message\\\](file:///%22message/)": [\\\"Get](file:///%22Get) next claim for enrichment\\\" }\\n\",\"stream\":\"stdout\",\"time\":\"2022-07-05T06:50:23.6644413Z\"}", "data": { "log": "{ \"@timestamp\": \"2022-07-05 06:50:23.6642\", \"log.level\": \"DEBUG\", \"log.logger\": \"Enrichment.Services.EnrichmentService\", \"message\": \"Get next claim for enrichment\" }", "stream": "stdout", "time": "2022-07-05T06:50:23.6644413Z" },
I tested the configuration both with decode_field_as and without any decoder.
Below is the configuration from the values.yaml file:
[SERVICE]
Flush 2
Daemon Off
Log_Level info
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Exclude_Path ${FLUENT_ELASTICSEARCH_EXCLUDE}
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 256MB
Skip_Long_Lines On
Refresh_Interval 10
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Merge_Log On
K8S-Logging.Parser On
K8S-Logging.Exclude On
[OUTPUT]
Name es
Match *
Host ${FLUENT_ELASTICSEARCH_HOST}
Port ${FLUENT_ELASTICSEARCH_PORT}
Logstash_Format On
Retry_Limit False
tls On
tls.verify Off
Replace_dots On
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep Off
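For reference, a parser with an explicit decoder (the decode_field_as variant mentioned above) would look roughly like the sketch below, following the parser decoders documentation; the "escaped" decoder and the "log" field name are assumptions based on the Docker log format shown earlier:
[PARSER]
    Name docker
    Format json
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep Off
    # Assumed decoder: unescape the stringified JSON carried in the "log" key
    Decode_Field_As escaped log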
Are you able to reproduce the problem with Fluent Bit v1.9.x? Note that 1.8.x is no longer maintained.
Hi @edsiper, I have not tested this with 1.9, but the problem is there in 1.8.5; I have tested this across multiple Kubernetes environments. There is nothing in the fluent-bit logs to indicate why JSON parsing is not happening properly. As per the documentation, JSON parsing was fixed long ago, but I am seeing this issue now. Can you tell me where to look for the error cause, or how to troubleshoot this further? I looked everywhere for logs and did not find any useful info.
Sorry, I cannot help you with v1.8.5; it's pretty old (released in August 2021). You should at least upgrade to v1.8.15 and check whether the issue persists.
@edsiper - we will try our cluster config on a local instance and upgrade one of the clusters to 1.8.15 as you've suggested. Will be back shortly with results.
@edsiper we have upgraded Fluent Bit to 1.8.15 and are still seeing the JSON parsing issue. Can you please help with this?
{
"_index": "logstash-2022.07.20",
"_type": "_doc",
"_id": "PMPBGYIBud9fXG-8c3xh",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2022-07-20T03:58:03.348Z",
"log": "{\"log\":\"{ [\\\"@timestamp\\\](file:///%22@timestamp/)": [\\\"2022-07-20](file:///%222022-07-20) 03:58:03.3477\\\", [\\\"log.level\\\](file:///%22log.level/)": [\\\"DEBUG\\\](file:///%22DEBUG/)", [\\\"log.logger\\\](file:///%22log.logger/)": [\\\"Enrichment.Services.EnrichmentService\\\](file:///%22Enrichment.Services.EnrichmentService/)", [\\\"message\\\](file:///%22message/)": [\\\"Get](file:///%22Get) next claim for enrichment\\\" }\\n\",\"stream\":\"stdout\",\"time\":\"2022-07-20T03:58:03.347958567Z\"}",
"data": {
"log": "{ \"@timestamp\": \"2022-07-20 03:58:03.3477\", \"log.level\": \"DEBUG\", \"log.logger\": \"Enrichment.Services.EnrichmentService\", \"message\": \"Get next claim for enrichment\" }",
"stream": "stdout",
"time": "2022-07-20T03:58:03.347958567Z"
},
I have updated Fluent Bit in a different cluster to 1.9.5 and fed it JSON logs to be parsed; the result is still the same, and the output log format is invalid JSON.
Docker logs passed as input:
$ kubectl logs python-gunicorn-webapp-86849477d6-2qhnb
{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-20T14:55:32.512988Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path \n", "logger": "myapp"}
Logs on kibana:
"@timestamp": "2022-07-20T14:55:32.514Z",
"log": "{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-20T14:55:32.512988Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n",
"stream": "stdout",
"time": "2022-07-20T14:55:32.514187635Z",

@edsiper @patrick-stephens appreciate some help on this.
@nandikatlakishore I think you need to look at what your underlying container runtime log looks like - not what the app is producing in the pod. This is what FB will be trying to parse and you may have some strange configuration there messing with it. I suspect it's quoting your JSON and wrapping it in additional JSON.
Look at the actual container log files you are tailing and not the output of the pod. Once you have this you should be able to use dummy input to then generate a fully reproducible example here.
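For example, a minimal reproduction along those lines might look like the following sketch. It assumes a hypothetical test-parsers.conf containing the docker parser shown earlier, and the Dummy payload below is a shortened copy of the escaped container-log line:
[SERVICE]
    Parsers_File test-parsers.conf
[INPUT]
    # Replay one record shaped like what the container runtime writes to disk
    Name  dummy
    Dummy {"log":"{\"level\": \"INFO\", \"message\": \"Request on /path\"}\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}
[FILTER]
    # Apply the docker parser to the escaped JSON held in the "log" key
    Name     parser
    Match    *
    Key_Name log
    Parser   docker
[OUTPUT]
    # Print what Fluent Bit actually emits, independent of Elasticsearch/Kibana
    Name  stdout
    Match *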
@patrick-stephens, below is the log from the container runtime log file. The entries are escaped but are valid JSON, whereas the stdout log stream, after being processed by FB, ends up as invalid JSON. You can look at the container log below.
[root@ip-xxx python-gunicorn-webapp]# tail -2 0.log
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:16.733977Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:18.506932Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:18.507187382Z"}
[root@ip-xxx python-gunicorn-webapp]# pwd
/var/log/pods/default_python-gunicorn-webapp-86849477d6-2qhnb_37510592-b3dd-4227-863a-113c734fecdd/python-gunicorn-webapp
[root@ip-10-0-102-99 python-gunicorn-webapp]#
I would try using the decoders to see if that will un-escape them: https://docs.fluentbit.io/manual/pipeline/parsers/decoders
I would also use the stdout output plugin to see what they look like to Fluent Bit in case it is something downstream in Elastic/Kibana messing with the logs: https://docs.fluentbit.io/manual/pipeline/outputs/standard-output
This will help you see what Fluent Bit actually is doing to the logs.
@patrick-stephens, I have used decoders as mentioned in the documentation, still the same result. Also, stdout has the same output with malformed JSON.
That's fine; my main point was to get it correct here, otherwise it will always be wrong in Kibana. But also, a reproducible example like this is orders of magnitude easier to iterate on and debug, as you don't need to handle a complete stack, just a single container.
@nandikatlakishore weird, the example log file you show and the docs work perfectly for me; it seems to be producing valid JSON below, I think:
$ docker run --rm -it -v $PWD/:/fluent-bit/etc/:ro fluent/fluent-bit
Fluent Bit v1.9.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2022/08/08 09:30:28] [ info] [fluent bit] version=1.9.6, commit=b30141ba90, pid=1
[2022/08/08 09:30:28] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/08/08 09:30:28] [ info] [cmetrics] version=0.3.5
[2022/08/08 09:30:28] [ info] [sp] stream processor started
[2022/08/08 09:30:28] [ info] [output:stdout:stdout.0] worker #0 started
[2022/08/08 09:30:28] [ info] [input:tail:tail.0] inotify_fs_add(): inode=100401660 watch_fd=1 name=/fluent-bit/etc/test.log
[0] tail.0: [1659951028.109776966, {"log"=>"{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path
", "logger": "myapp"}
", "stream"=>"stdout", "time"=>"2022-07-29T14:47:16.734687718Z"}]
[1] tail.0: [1659951028.109787886, {"log"=>"{"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:18.506932Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path
", "logger": "myapp"}
", "stream"=>"stdout", "time"=>"2022-07-29T14:47:18.507187382Z"}]
With config:
$ cat fluent-bit.conf
[SERVICE]
Parsers_file parser.conf
[INPUT]
Name Tail
Read_from_head on
Path /fluent-bit/etc/test.log
[FILTER]
Name parser
Parser docker
Key_name log
Match *
[OUTPUT]
Name stdout
Match *
$ cat parser.conf
[PARSER]
Name docker
Format json
Decode_Field_As escaped log
$ cat test.log
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:16.733977Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:16.734687718Z"}
{"log":"{\"tags\": [\"role:web\", \"env:prod\"], \"timestamp\": \"2022-07-29T14:47:18.506932Z\", \"level\": \"INFO\", \"host\": \"python-gunicorn-webapp-86849477d6-2qhnb\", \"path\": \"/myapp.py\", \"message\": \"Request on /path \\n\", \"logger\": \"myapp\"}\n","stream":"stdout","time":"2022-07-29T14:47:18.507187382Z"}
The JSON is therefore:
"log": {"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path
", "logger": "myapp"}
"
The newline character is interpreted by the shell here, I think; \n is required in the actual JSON format, I believe.
If I add a Lua filter to output the specific key then it looks OK; my only concern is the multiline:
$ docker run --rm -it -v $PWD/:/fluent-bit/etc/:ro fluent/fluent-bit
Fluent Bit v1.9.6
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2022/08/08 09:40:33] [ info] [fluent bit] version=1.9.6, commit=b30141ba90, pid=1
[2022/08/08 09:40:33] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/08/08 09:40:33] [ info] [cmetrics] version=0.3.5
[2022/08/08 09:40:33] [ info] [sp] stream processor started
[2022/08/08 09:40:33] [ info] [output:stdout:stdout.0] worker #0 started
lua: {"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:16.733977Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path
", "logger": "myapp"}
lua: {"tags": ["role:web", "env:prod"], "timestamp": "2022-07-29T14:47:18.506932Z", "level": "INFO", "host": "python-gunicorn-webapp-86849477d6-2qhnb", "path": "/myapp.py", "message": "Request on /path
", "logger": "myapp"}
fluent-bit.conf:
[SERVICE]
Parsers_file parser.conf
[INPUT]
Name Tail
Read_from_head on
Path /fluent-bit/etc/test.log
[FILTER]
Name parser
Parser docker
Key_name log
Match *
[FILTER]
Name lua
Match *
script test.lua
call cb_print
[OUTPUT]
Name stdout
Match *
test.lua:
function cb_print(tag, timestamp, record)
    print("lua:", record['log'])
    return 0, timestamp, record
end
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.