fluent-bit
Fluent Bit modifies the fractional seconds of timestamps
Hello.
I am not sure whether this is a bug, a feature, or a problem somewhere else, but I discovered that Fluent Bit shifts the timestamp. See these examples:
Example #1:
Raw record (exact line from log file):
2022-09-07T11:52:55.505530838Z stdout F [2022-09-07 11:52:55] Working on ADDED configmap monitoring/prometheus-stack-kube-prom-grafana-overview
Fluent Bit processing (processed line from the Fluent Bit output, CSV format):
1662551575.505530834,
"[2022-09-07 11:52:55] Working on ADDED configmap monitoring/prometheus-stack-kube-prom-grafana-overview",
"stdout","F",
{"labels"=>
{"pod-template-hash"=>"69f7c847f7",
"app_kubernetes_io/name"=>"grafana",
"app_kubernetes_io/instance"=>"prometheus-stack"},
"namespace_name"=>"monitoring",
"pod_id"=>"3947599e-6663-4619-aa25-3c779c39edae",
"container_name"=>"grafana-sc-dashboard",
"host"=>"ip-XX-XX-XX-XX.eu-central-1.compute.internal",
"container_hash"=>...common.cdn.repositories..../kiwigrid/k8s-sidecar@sha256:1f025ae37b7b20d63bffd179e5e6f972039dd53d9646388c0a8c456229c7bbcb,
"docker_id"=>"c1f7296f593eeff90a87d19c555490a285cca4cfd0c69dd060d789b229d3b7cd",
"container_image"=>"...common.cdn.repositories..../kiwigrid/k8s-sidecar:1.15.6",
"pod_name"=>"prometheus-stack-grafana-69f7c847f7-rk2bc"},
"2022-09-07T11:52:55.505530838Z"
The timestamp is shifted by 4 nanoseconds into the past.
Example #2:
Raw record:
2022-09-07T11:52:55.505561119Z stdout F [2022-09-07 11:52:55] Found 'data' on configmap
Fluent Bit processing:
1662551575.505561113,
"[2022-09-07 11:52:55] Found 'data' on configmap",
"stdout","F",
{"labels"=>
{"pod-template-hash"=>"69f7c847f7",
"app_kubernetes_io/name"=>"grafana",
"app_kubernetes_io/instance"=>"prometheus-stack"},
"namespace_name"=>"monitoring",
"pod_id"=>"3947599e-6663-4619-aa25-3c779c39edae",
"container_name"=>"grafana-sc-dashboard",
"host"=>"ip-XX-XX-XX-XX.eu-central-1.compute.internal",
"container_hash"=>...common.cdn.repositories..../kiwigrid/k8s-sidecar@sha256:1f025ae37b7b20d63bffd179e5e6f972039dd53d9646388c0a8c456229c7bbcb,
"docker_id"=>"c1f7296f593eeff90a87d19c555490a285cca4cfd0c69dd060d789b229d3b7cd",
"container_image"=>"...common.cdn.repositories..../kiwigrid/k8s-sidecar:1.15.6",
"pod_name"=>"prometheus-stack-grafana-69f7c847f7-rk2bc"},
"2022-09-07T11:52:55.505561119Z"
The timestamp is shifted by 6 nanoseconds into the past.
Example #3:
Raw record:
2022-09-07T11:52:55.505566066Z stdout F [2022-09-07 11:52:55] Writing grafana-overview.json
Fluent Bit processing:
1662551575.505566120,
"[2022-09-07 11:52:55] Writing grafana-overview.json",
"stdout","F",
{"labels"=>
{"pod-template-hash"=>"69f7c847f7",
"app_kubernetes_io/name"=>"grafana",
"app_kubernetes_io/instance"=>"prometheus-stack"},
"namespace_name"=>"monitoring",
"pod_id"=>"3947599e-6663-4619-aa25-3c779c39edae",
"container_name"=>"grafana-sc-dashboard",
"host"=>"ip-XX-XX-XX-XX.eu-central-1.compute.internal",
"container_hash"=>...common.cdn.repositories..../kiwigrid/k8s-sidecar@sha256:1f025ae37b7b20d63bffd179e5e6f972039dd53d9646388c0a8c456229c7bbcb,
"docker_id"=>"c1f7296f593eeff90a87d19c555490a285cca4cfd0c69dd060d789b229d3b7cd",
"container_image"=>"...common.cdn.repositories..../kiwigrid/k8s-sidecar:1.15.6",
"pod_name"=>"prometheus-stack-grafana-69f7c847f7-rk2bc"},
"2022-09-07T11:52:55.505566066Z"
The timestamp is shifted by 54 nanoseconds into the future.
Expected behavior: Fluent Bit should not change the fractional seconds of the timestamp.
Your Environment
- Version used: 1.9.5 / 1.9.7
- Configuration: Docker image fluent/fluent-bit on containerd
- Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.24.3
- Server type and version: AWS c5.xlarge (CPU: 4, Memory: 8GB)
- Operating System and version: gardenlinux 576.11.0 (Debian 11 derivative)
- Filters:
  - lua
  - kubernetes
- Plugins:
  - systemd
  - tail
  - opensearch
Additional context
When you have only a few log records per second, this probably does not hit you, but when you have a million and more log records from the same pods, the output is unusable; you need the fractional-seconds scale. In this scenario the records get mixed up over time because of the shifted fractional seconds, for a reason unknown to me.
I wrote a small script to generate marked lines, and from the output it seems that the ordering isn't correct:

Do we have something set wrong on our side? If yes, can you please point me in the right direction? Is this really a bug?
Thank you!
Are you parsing the timestamp from the raw record? If not, it will be using the local time.
Hi @patrick-stephens, thank you for the feedback.
Yes, we have a parser, and as you can see from the last field, Fluent Bit parses it correctly.
As you can see in example 1:
- the raw record contains the timestamp 2022-09-07T11:52:55.505530838Z
- Fluent Bit reads it correctly and writes it correctly to the last field ("2022-09-07T11:52:55.505530838Z")
- BUT the timestamp sent to OpenSearch is 1662551575.505530834 (2022-09-07T11:52:55.505530834Z)
Can you or somebody explain why Fluent Bit sends the timestamp
2022-09-07T11:52:55.505530834Z
instead of
2022-09-07T11:52:55.505530838Z
?
Why does it not send what it actually read? Why does it modify it in what looks, from my perspective, like a totally random pattern?
If you have only a few log records per second, you probably won't spot this problem. But when you have a large K8s cluster with components that can generate millions of log records per second, the fractional seconds are crucial for us. If Fluent Bit randomly changes the timestamp, the log records become useless for debugging, because you don't know which event is the cause of which consequence ...
For example, the raw log records could be:
2022-09-07T12:10:00.000000001Z stdout F WARN ...
2022-09-07T12:10:00.000000002Z stdout F INFO ...
2022-09-07T12:10:00.000000003Z stdout F INFO ...
2022-09-07T12:10:00.000000004Z stdout F INFO ...
2022-09-07T12:10:00.000000005Z stdout F ERROR ...
but Fluent Bit can send to OpenSearch:
2022-09-07T12:10:00.000000007Z stdout F ERROR ...
2022-09-07T12:10:00.000000009Z stdout F INFO ...
2022-09-07T12:10:00.000000013Z stdout F WARN ...
2022-09-07T12:10:00.000000024Z stdout F INFO ...
2022-09-07T12:10:00.000000033Z stdout F INFO ...
and now we are lost :(. We don't know which events caused which consequences ...
What's your parser config? What field is it using for the timestamp and is it parsing the fractional seconds?
Hi @patrick-stephens.
I am not sure that I understand you, so maybe I am wrong.
Parser config:
[PARSER]
    Name kube-tag
    Format regex
    Regex ^(?<pod_id>[^_/]+)\.(?<namespace_name>[^_/]+)\.(?<pod_name>[^_/]+)\.(?<container_name>[^/]+)$
    Time_Format %Y-%m-%dT%H:%M:%S.%L%Z
We are using the @timestamp field.
Here is a processed record from OpenSearch on my test cluster (in this scenario Fluent Bit feeds OpenSearch directly, with no other component in between):

If you need any additional information, please let me know. My goal is to understand why these "jumps" occur, OR to fix the problem.
@LHozzan
- Could you share your config file and Lua script to clarify the pipeline?
- How did you get the CSV output? Using the out_file plugin?
@nokute78
Thank you for your feedback.
Could you share your config file and Lua script to clarify the pipeline?
Sure, see below:
fluent-bit.conf
[SERVICE]
    Flush 1
    Log_Level info
    Daemon off
    Parsers_File parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020
    storage.type filesystem
    storage.path /var/fluent-bit/state/flb-storage/
    storage.sync normal
    storage.checksum off
    storage.backlog.mem_limit 50M
    storage.max_chunks_up 1000
    storage.metrics on
@INCLUDE _input-tail.conf
@INCLUDE _input-systemd.conf
@INCLUDE _filter-kubernetes.conf
@INCLUDE _filter-de_dot.conf
@INCLUDE filter-*.conf
@INCLUDE _output-opensearch-*.conf
@INCLUDE output-*.conf
_filter-de_dot.conf
[FILTER]
    Name lua
    Match kube.*
    script /fluent-bit/etc/dedot.lua
    call dedot
_filter-kubernetes.conf
[FILTER]
    Name kubernetes
    Match kube.*
    Kube_Tag_Prefix kube.
    Regex_Parser kube-tag
    Kube_URL https://kubernetes.default.svc:443
    Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log true
    Keep_Log On
    K8S-Logging.Parser On
    K8S-Logging.Exclude On
    Labels On
    Kube_Meta_Cache_TTL 60s
    Annotations Off
_input-systemd.conf
[INPUT]
    Name systemd
    Tag systemd.*
    Systemd_Filter _SYSTEMD_UNIT=kubelet.service
    Systemd_Filter _SYSTEMD_UNIT=containerd.service
    Systemd_Filter_Type Or
    Read_From_Tail On
    Path /run/log/journal
    Strip_Underscores On
_input-tail.conf
[INPUT]
    Name tail
    Tag kube.<pod_id>.<namespace_name>.<pod_name>.<container_name>
    Tag_Regex /var/log/pods/(?<namespace_name>[^_/]+)_(?<pod_name>[^_/]+)_(?<pod_id>[^_/]+)/(?<container_name>[^/]+)/.*
    Path /var/log/pods/*/*/*.log
    DB /var/fluent-bit/state/flb_kube.db
    Exclude_Path /var/log/pods/*/*/*.gz,/var/log/pods/*/*/*.zip
    Skip_Long_Lines Off
    Buffer_Chunk_Size 32k
    Buffer_Max_Size 128k
    Refresh_Interval 30
    Rotate_Wait 10
    Read_from_Head false
    multiline.parser cri, go, java
_output-opensearch-containers.conf
[OUTPUT]
    Name opensearch
    Match kube.*
    Host ${FLUENT_opensearch_HOST}
    Port ${FLUENT_opensearch_PORT}
    HTTP_User ${FLUENT_opensearch_USER}
    HTTP_Passwd ${FLUENT_opensearch_PASSWORD}
    Logstash_Format On
    Logstash_Prefix containers
    Replace_Dots On
    Retry_Limit 25
    Trace_Error On
    Suppress_Type_Name On
    Include_Tag_Key Off
    Time_Key_Nanos On
    Generate_ID On
    tls On
    tls.Verify On
    tls.ca_file /fluent-bit/ssl/root-ca.pem
    tls.crt_file /fluent-bit/ssl/admin.pem
    tls.key_file /fluent-bit/ssl/admin-key.pem
    Buffer_Size 1024kb
_output-opensearch-journals.conf
[OUTPUT]
    Name opensearch
    Match systemd.*
    Host ${FLUENT_opensearch_HOST}
    Port ${FLUENT_opensearch_PORT}
    HTTP_User ${FLUENT_opensearch_USER}
    HTTP_Passwd ${FLUENT_opensearch_PASSWORD}
    Logstash_Format On
    Logstash_Prefix systemd
    Replace_Dots On
    Retry_Limit 25
    Trace_Error On
    Suppress_Type_Name On
    Include_Tag_Key Off
    Time_Key_Nanos On
    Generate_ID On
    tls On
    tls.Verify On
    tls.ca_file /fluent-bit/ssl/root-ca.pem
    tls.crt_file /fluent-bit/ssl/admin.pem
    tls.key_file /fluent-bit/ssl/admin-key.pem
dedot.lua
function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local dedotted = string.gsub(k, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end
filter-nginx.conf
[FILTER]
    Name rewrite_tag
    Match kube.*
    Rule $kubernetes['labels']['app_kubernetes_io/name'] "^(ingress-nginx)$" nginx false

[FILTER]
    Name parser
    Match nginx
    Key_Name log
    Parser k8s-nginx-ingress
    Reserve_Data True

[FILTER]
    # Sanitizing garbage from penetration tests
    Name modify
    Match nginx
    Condition Key_value_equals upstream_response_length -
    Set upstream_status 999
    Set upstream_response_time 0.009
    Set upstream_response_length 9
output-nginx.conf
[OUTPUT]
    Name opensearch
    Match nginx
    Host ${FLUENT_opensearch_HOST}
    Port ${FLUENT_opensearch_PORT}
    HTTP_User ${FLUENT_opensearch_USER}
    HTTP_Passwd ${FLUENT_opensearch_PASSWORD}
    Logstash_Format On
    Logstash_Prefix nginx
    Replace_Dots On
    Retry_Limit False
    Trace_Error On
    Suppress_Type_Name On
    Include_Tag_Key Off
    Time_Key_Nanos Off
    Generate_ID On
    tls On
    tls.Verify On
    tls.ca_file /fluent-bit/ssl/root-ca.pem
    tls.crt_file /fluent-bit/ssl/admin.pem
    tls.key_file /fluent-bit/ssl/admin-key.pem
parsers.conf
[PARSER]
    Name kube-tag
    Format regex
    Regex ^(?<pod_id>[^_/]+)\.(?<namespace_name>[^_/]+)\.(?<pod_name>[^_/]+)\.(?<container_name>[^/]+)$
    Time_Format %Y-%m-%dT%H:%M:%S.%L%Z

[PARSER]
    Name syslog
    Format regex
    Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
    Time_Key time
    Time_Format %b %d %H:%M:%S

[PARSER]
    Name k8s-nginx-ingress
    Format regex
    Regex ^(?<host>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referrer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<reg_id>[^ ]*).*$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
How did you get the CSV output? Using the out_file plugin?
Yes, see the config below:
[OUTPUT]
    Name file
    Match *
    Format csv
    Mkdir true
    Path /var/fluent-bit/state/rawdata
Meanwhile I have made a little progress. In a scenario where only Fluent Bit and OpenSearch are deployed, it looks much better, and in OpenSearch Dashboards the records are now "correctly ordered". It looks like:

Unfortunately, these jumps still occur:

Hope that helps.
If you need any additional information, please let me know.
@LHozzan Thank you for the information, I can reproduce it. Could you use code = 2 in the Lua script? https://docs.fluentbit.io/manual/pipeline/filters/lua#return-values
If code equals 2, it means the original timestamp is not modified and the record has been modified.
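For illustration, a minimal sketch of what that return value means (based only on the documented return values; the cb_modify name and the edited_by_lua key are made up for this example, they are not from the original scripts):

function cb_modify(tag, timestamp, record)
    record["edited_by_lua"] = true
    -- return 1, timestamp, record  -- code 1: the returned timestamp (a Lua number, i.e. a double) is converted back, which can round
    return 2, timestamp, record     -- code 2: the modified record is kept, the original timestamp stays untouched
end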
I think the Lua <-> C data conversion causes this issue.
I tested the following config.
a.conf:
[SERVICE]
    parsers_file parsers.conf

[INPUT]
    Name tail
    Path a.log
    Read_From_Head on
    multiline.parser cri

[FILTER]
    Name lua
    Match *
    script a.lua
    call a

[OUTPUT]
    Name stdout
    Match *
a.lua:
function a(tag, timestamp, record)
    return 1, timestamp, record
end
a.log:
2022-09-07T11:52:55.505530838Z stdout F [2022-09-07 11:52:55] Working on ADDED configmap monitoring/prometheus-stack-kube-prom-grafana-overview
The output is:
[0] tail.0: [1662551575.505530834, {"_p"=>"F", "time"=>"2022-09-07T11:52:55.505530838Z", "log"=>"[2022-09-07 11:52:55] Working on ADDED configmap monitoring/prometheus-stack-kube-prom-grafana-overview", "stream"=>"stdout"}]
$ ../../bin/fluent-bit -c a.conf
Fluent Bit v2.0.0
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2022/10/05 20:47:09] [ info] [fluent bit] version=2.0.0, commit=8a4ded6f66, pid=16631
[2022/10/05 20:47:09] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2022/10/05 20:47:09] [ info] [cmetrics] version=0.4.0
[2022/10/05 20:47:09] [ info] [input:tail:tail.0] initializing
[2022/10/05 20:47:09] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2022/10/05 20:47:09] [ info] [input:tail:tail.0] multiline core started
[2022/10/05 20:47:09] [ info] [sp] stream processor started
[2022/10/05 20:47:09] [ info] [input:tail:tail.0] inotify_fs_add(): inode=4730770 watch_fd=1 name=a.log
[2022/10/05 20:47:09] [ info] [output:stdout:stdout.1] worker #0 started
[2022/10/05 20:47:09] [ info] [output:file:file.0] worker #0 started
[0] tail.0: [1662551575.505530834, {"_p"=>"F", "time"=>"2022-09-07T11:52:55.505530838Z", "log"=>"[2022-09-07 11:52:55] Working on ADDED configmap monitoring/prometheus-stack-kube-prom-grafana-overview", "stream"=>"stdout"}]
Hi @nokute78.
Thank you for your effort, I really appreciate it.
Just to clarify: do you mean that this file
dedot.lua
function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local dedotted = string.gsub(k, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end
should change to this:
function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 2, timestamp, record --[[ previous 1, now 2 ]]
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local dedotted = string.gsub(k, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end
?
Many thanks!
Hi @nokute78.
You are right! If I change the Lua script as above, Fluent Bit no longer modifies @timestamp and keeps the exact value, the same as in the time field.

Just out of curiosity: is this a problem in our script because we don't understand the documentation, or is this a bug in Fluent Bit that can occur anytime in the future, when somebody attempts to use a Lua script to modify their workflow and timestamps?
Thank you very much for the assistance with debugging, I really appreciate it!
@LHozzan Thank you for testing. I'm glad to hear that.
Is this a problem in our script because we don't understand the documentation, or is this a bug in Fluent Bit that can occur anytime in the future, when somebody attempts to use a Lua script to modify their workflow and timestamps?
The current Lua API can cause a rounding error, and that is what causes this issue. I think the API needs to change, but that would be a breaking change.
Lua's Number type is a double-precision floating-point type. https://www.lua.org/pil/2.3.html
Fluent Bit uses flb_time, which is a wrapper around timespec. It represents a timestamp with integer types.
struct timespec {
    time_t tv_sec;   /* seconds */
    long   tv_nsec;  /* nanoseconds */
};
Floating-point types can cause rounding errors. The following code reproduces the issue.
#include <stdio.h>

int main() {
    /* seconds and nanoseconds from example 1 (tv_sec / tv_nsec) */
    int a = 1662551575;
    int b = 505530838;

    /* packing both into a single double loses the low nanosecond digits */
    double val = (double)a + ((double)b / (double)1000000000);
    printf("val=%.9lf\n", val);
    return 0;
}
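For reference, compiling and running this snippet should print val=1662551575.505530834, which is exactly the shifted timestamp from example 1 above.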
The same result in golang. https://go.dev/play/p/dV7o70r3ZRP
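The same loss of precision can also be shown directly in Lua (a minimal sketch, not from the original thread, assuming a stock Lua interpreter):

-- the nanosecond timestamp from example 1 cannot be stored exactly in a double
local ts = 1662551575 + 505530838 / 1000000000
print(string.format("%.9f", ts))   -- prints 1662551575.505530834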
Thank you very much for the explanation! I really appreciate it!
This solution works for us and does not bring any overhead or limitations. Please consider my problem solved.
The current Lua API can cause a rounding error, and that is what causes this issue. I think the API needs to change, but that would be a breaking change.
If you wish, you can close this issue, or leave it open for somebody who has similar problems :).
I hope this problem will disappear over time ...
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.