out_pgsql: add a configurable value called "daemon" for out pgsql plugin

Open TomlinfreeGit opened this issue 2 years ago • 12 comments

Update the out_pgsql plugin: add a configurable parameter that lets the plugin run in a daemon mode. In daemon mode, a configuration error in out_pgsql does not cause Fluent Bit to crash or exit.

N/A

Enter [N/A] in the box, if an item is not applicable to your change.

Testing

Before we can approve your change; please submit the following in a comment:

  • [x] Example configuration file for the change
  • [x] Debug log output from testing the change
  • [x] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [x] Run local packaging test showing all targets (including any new ones) build.
  • [ ] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [x] Documentation required for this feature

Backporting

  • [x] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

TomlinfreeGit avatar Apr 18 '23 06:04 TomlinfreeGit

example config file for running out_pgsql in daemon mode:

[SERVICE]
    Flush  5
    Daemon  False
    Log_Level  error

[INPUT]
    name  cpu
    tag  cpu.local

[OUTPUT]
    name  pgsql
    host  192.168.142.132
    port  15432
    user  postgres
    password  yourpassword
    match  *
    database  agent
    table  test
    Daemon  True

[OUTPUT]
    name  stdout
    match  *

running result and valgrind result:

image image


example config file for running out_pgsql not in daemon mode:

[SERVICE]
    Flush  5
    Daemon  False
    Log_Level  error

[INPUT]
    name  cpu
    tag  cpu.local

[OUTPUT]
    name  pgsql
    host  192.168.142.132
    port  15432
    user  postgres
    password  yourpassword
    match  *
    database  agent
    table  test
    Daemon  False

[OUTPUT]
    name  stdout
    match  *

running result and valgrind result:

image image

TomlinfreeGit avatar Apr 18 '23 06:04 TomlinfreeGit

Related documentation pull request: https://github.com/fluent/fluent-bit-docs/pull/1081

TomlinfreeGit avatar Apr 18 '23 06:04 TomlinfreeGit

@sxd pls take a look

edsiper avatar Apr 25 '23 09:04 edsiper

@edsiper will take a look later today

@TomlinfreeGit the DCO test is failing; can you please fix that in the meantime?

Best Regards!

sxd avatar Apr 25 '23 09:04 sxd

@TomlinfreeGit so the idea is to just restart, or to make the error not a crash here, right? What if the user doesn't see this error? How are they supposed to know that there's an error in the configuration?

sxd avatar Apr 25 '23 19:04 sxd

Yes. For other output plugins, a configuration error does not crash Fluent Bit; it only logs an error when flushing data fails, and the failure can be noticed through health_check. For out_pgsql, however, a configuration error crashes Fluent Bit directly.

  1. The purpose of this PR is to let out_pgsql handle configuration errors the same way as other output plugins.
  2. out_pgsql configuration errors are logged during both init and flush; if users don't see the logs, they can still notice the problem through health_check.

image image
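To make the two startup policies concrete, here is a toy Python model. It is not Fluent Bit's C code: `start_outputs`, `InitError`, and the `tolerant` set are hypothetical names used only for illustration, and the "degraded" state assumes, as described above, that a tolerated output stays loaded and keeps logging errors at flush time.

```python
import logging

log = logging.getLogger("toy-engine")


class InitError(Exception):
    """Raised by a toy output whose configuration cannot be used."""


def start_outputs(outputs, tolerant=frozenset()):
    """Initialise (name, init_fn) pairs and return a name -> state mapping.

    An init failure in an output listed in `tolerant` is logged and the
    output is marked "degraded"; any other init failure aborts startup.
    """
    states = {}
    for name, init_fn in outputs:
        try:
            init_fn()
            states[name] = "ready"
        except InitError as err:
            log.error("[output:%s] init failed: %s", name, err)
            if name not in tolerant:
                # Fail-fast policy: one bad output stops the whole process.
                raise SystemExit(1) from err
            # Tolerant policy: keep running so the other outputs still work.
            states[name] = "degraded"
    return states


def bad_pgsql():
    # Stand-in for an unreachable or misconfigured database.
    raise InitError("connection refused")


def ok_stdout():
    # Stand-in for an output whose configuration is fine.
    pass


if __name__ == "__main__":
    outputs = [("pgsql", bad_pgsql), ("stdout", ok_stdout)]
    print(start_outputs(outputs, tolerant={"pgsql"}))
```

Calling `start_outputs(outputs)` without the `tolerant` argument raises `SystemExit(1)` on the first failing output, which mirrors the exit-on-error behaviour this PR makes optional.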

TomlinfreeGit avatar Apr 26 '23 02:04 TomlinfreeGit

@TomlinfreeGit After testing this for about 10 minutes, I don't see any error in the logs saying that something needs to be fixed. The plugin just finished, which doesn't look like desired behavior: everything starts and appears to work, but if I can't flush the logs to the desired database, that may be an issue.

This is what I can see in the logs:

Fluent Bit v2.1.0
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/04/26 12:50:03] [error] [output:pgsql:pgsql.0] failed connecting to host=192.168.2.69 with error: connection to server at "192.168.2.69", port 5432 failed: No route to host
	Is the server running on that host and accepting TCP/IP connections?

And that's it, now on the API I'm getting this:

curl -i http://127.0.0.1:2020/api/v2/health
HTTP/1.1 200 OK
Server: Monkey/1.7.0
Date: Wed, 26 Apr 2023 10:53:51 GMT
Transfer-Encoding: chunked
Content-Type: application/json

{"fluent-bit":{"version":"2.1.0","edition":"Community","flags":["FLB_HAVE_IN_STORAGE_BACKLOG","FLB_HAVE_CHUNK_TRACE","FLB_HAVE_PARSER","FLB_HAVE_RECORD_ACCESSOR","FLB_HAVE_STREAM_PROCESSOR","FLB_HAVE_TLS","FLB_HAVE_OPENSSL","FLB_HAVE_METRICS","FLB_HAVE_WASM","FLB_HAVE_AWS","FLB_HAVE_AWS_CREDENTIAL_PROCESS","FLB_HAVE_SIGNV4","FLB_HAVE_SQLDB","FLB_HAVE_METRICS","FLB_HAVE_HTTP_SERVER","FLB_HAVE_SYSTEMD","FLB_HAVE_VALGRIND","FLB_HAVE_FORK","FLB_HAVE_TIMESPEC_GET","FLB_HAVE_GMTOFF","FLB_HAVE_UNIX_SOCKET","FLB_HAVE_ATTRIBUTE_ALLOC_SIZE","FLB_HAVE_PROXY_GO","FLB_HAVE_LIBBACKTRACE","FLB_HAVE_REGEX","FLB_HAVE_UTF8_ENCODER","FLB_HAVE_LUAJIT","FLB_HAVE_C_TLS","FLB_HAVE_ACCEPT4","FLB_HAVE_INOTIFY","FLB_HAVE_GETENTROPY","FLB_HAVE_GETENTROPY_SYS_RANDOM"]}}

So how will someone notice that there's an error if everything starts without a problem? In my opinion a daemon should fail if there's anything that may cause an error. Inside a pod/container, for example, it will be restarted until the problem is fixed. This change is more like making an error invisible.

On the other hand, which plugins in Fluent Bit behave the way you want out_pgsql to behave? Can you elaborate on why we should implement this? I may be wrong, so I'd like to read more.

Thanks in advance!

sxd avatar Apr 26 '23 10:04 sxd

Hi, I can give some examples to explain how other output plugins behave when the user gives a wrong configuration.

Example 1 (a.config):

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name http
    host 192.1.1.1
    match  *
    retry_limit 1

[OUTPUT]
    name influxdb
    host 192.2.3.5
    port 8086
    bucket org
    match *
    retry_limit 1

[OUTPUT]
    name opensearch
    host 192.2.2.2
    port 9200
    match *
    retry_limit 1

Running a.config for several minutes does not crash Fluent Bit; you get the following logs: image

Example 2 (b.config):

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name            pgsql
    host            192.168.142.132
    port            15432
    user            postgres
    password        yourpwd
    match           *
    database        agent
    table           test

[OUTPUT]
    name stdout
    match *

Running b.config makes Fluent Bit exit immediately, even though the configuration of the other output plugin (stdout) is correct, and you get the following logs: image

Example 3 (c.config), out_pgsql running in daemon mode:

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name http
    host 192.1.1.1
    match  *
    retry_limit 1

[OUTPUT]
    name influxdb
    host 192.2.3.5
    port 8086
    bucket org
    match *
    retry_limit 1

[OUTPUT]
    name opensearch
    host 192.2.2.2
    port 9200
    match *
    retry_limit 1

[OUTPUT]
    name            pgsql
    host            192.168.142.132
    port            15432
    user            postgres
    password        yourpwd
    match           *
    database        agent
    table           test
    retry_limit     1
    Daemon          True

Running c.config for several minutes does not crash Fluent Bit; you get the following logs: image

TomlinfreeGit avatar Apr 27 '23 01:04 TomlinfreeGit

Thanks for your kind review. From your comment, I guess you didn't turn on the health check feature. You can turn it on by setting Health_Check to On:

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

Then call http://127.0.0.1:2020/api/v1/health to check the health status of Fluent Bit. See https://docs.fluentbit.io/manual/administration/monitoring#health-check-for-fluent-bit
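As a rough sketch of how a monitoring script might consume that endpoint: the Fluent Bit monitoring documentation describes HTTP 200 with body `ok` for a healthy instance and HTTP 500 with body `error` for an unhealthy one. The helper names `classify_health` and `probe` are made up for this example, and treating any other response as unhealthy is an assumption.

```python
import urllib.error
import urllib.request


def classify_health(status_code: int, body: str) -> str:
    """Map a /api/v1/health response to "healthy" or "unhealthy"."""
    # Documented behaviour: 200 + "ok" is healthy, 500 + "error" is unhealthy.
    # Assumption: every other combination is reported as unhealthy.
    if status_code == 200 and body.strip() == "ok":
        return "healthy"
    return "unhealthy"


def probe(url: str = "http://127.0.0.1:2020/api/v1/health",
          timeout: float = 3.0) -> str:
    """Query the health endpoint once and classify the answer."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_health(resp.status, resp.read().decode())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, so the HTTP 500 case lands here.
        return classify_health(err.code, err.read().decode())
    except (urllib.error.URLError, OSError):
        # Nothing answered: Fluent Bit is down or HTTP_Server is off.
        return "unreachable"


if __name__ == "__main__":
    print(probe())
```

Checking the body as well as the status code matters here: a 200 response that carries something other than `ok`, such as the build-info JSON shown earlier in this thread, is not counted as healthy.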

TomlinfreeGit avatar Apr 27 '23 02:04 TomlinfreeGit

@TomlinfreeGit Hi! I was on holiday for the last month, so I didn't look into this. I'm getting into it today.

sxd avatar May 22 '23 07:05 sxd

Please do help with a review.

TomlinfreeGit avatar Jul 28 '23 02:07 TomlinfreeGit

@sxd Hi, please help with a review if possible. Thanks very much!

TomlinfreeGit avatar Aug 28 '23 08:08 TomlinfreeGit

@sxd can you review this? It looks ready to go if we can fix the checks and get a review. The docs PR is available and waiting: https://github.com/fluent/fluent-bit-docs/pull/1081

eschabell avatar Oct 21 '25 13:10 eschabell

@cosmo0920 maybe you would be able to take a look at this PR? Waiting on others is not getting it done. ;-)

eschabell avatar Oct 29 '25 13:10 eschabell