fluent-bit
out_es: add cloud_apikey configuration
Adds Elastic Cloud API Key support to the out_es plugin. This patch adds a new config option, `cloud_apikey`, which is added to the HTTP request through the `Authorization: ApiKey <cloud_apikey>` header.

Addresses #6727. While we could re-use the `cloud_auth` config option, we would have to make additional assumptions about the API Key to identify it properly (i.e. it does not contain `:`, is base64 encoded, etc.).
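For context, here is a minimal sketch of how such a header could be attached using Fluent Bit's HTTP client helpers. The helper name `es_set_cloud_apikey` is hypothetical and the actual patch may differ:

```c
/* Illustrative sketch only: attach "Authorization: ApiKey <key>" using
 * Fluent Bit's sds and HTTP client helpers. es_set_cloud_apikey is a
 * hypothetical helper name; the real patch may be structured differently. */
#include <string.h>
#include <fluent-bit/flb_sds.h>
#include <fluent-bit/flb_http_client.h>

static int es_set_cloud_apikey(struct flb_http_client *c,
                               const char *cloud_apikey)
{
    int ret;
    flb_sds_t header;
    flb_sds_t tmp;

    /* Elastic Cloud expects: Authorization: ApiKey <base64 API key> */
    header = flb_sds_create("ApiKey ");
    if (!header) {
        return -1;
    }

    tmp = flb_sds_cat(header, cloud_apikey, strlen(cloud_apikey));
    if (!tmp) {
        flb_sds_destroy(header);
        return -1;
    }
    header = tmp;

    ret = flb_http_add_header(c, "Authorization", 13,
                              header, flb_sds_len(header));
    flb_sds_destroy(header);
    return ret;
}
```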
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
- [x] Example configuration file for the change
- [x] Debug log output from testing the change
- [x] Attached Valgrind output that shows no leaks or memory corruption was found
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
- [N/A] Run local packaging test showing all targets (including any new ones) build.
- [N/A] Set the `ok-package-test` label to test for all targets (requires a maintainer to do).
Documentation
- [x] Documentation required for this feature: https://github.com/fluent/fluent-bit-docs/pull/1213
Backporting
- [N/A] Backport to latest stable release.
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Example configuration
```
[SERVICE]
Flush 1
Daemon off
Log_Level debug
[INPUT]
Name cpu
[OUTPUT]
Name stdout
Match *
[OUTPUT]
Name es
Match *
tls On
tls.verify Off
Cloud_Id <redacted>
Cloud_Apikey <redacted>
Suppress_Type_Name On
```
Debug output and Valgrind
```
$ valgrind ./bin/fluent-bit -c es.conf
==70111== Memcheck, a memory error detector
==70111== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==70111== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==70111== Command: ./bin/fluent-bit -c es.conf
==70111==
Fluent Bit v2.1.10
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2023/09/18 02:55:15] [ info] Configuration:
[2023/09/18 02:55:15] [ info] flush time | 1.000000 seconds
[2023/09/18 02:55:15] [ info] grace | 5 seconds
[2023/09/18 02:55:15] [ info] daemon | 0
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info] inputs:
[2023/09/18 02:55:15] [ info] cpu
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info] filters:
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info] outputs:
[2023/09/18 02:55:15] [ info] stdout.0
[2023/09/18 02:55:15] [ info] es.1
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info] collectors:
[2023/09/18 02:55:15] [ info] [fluent bit] version=2.1.10, commit=b777d90050, pid=70111
[2023/09/18 02:55:15] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2023/09/18 02:55:15] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/09/18 02:55:15] [ info] [cmetrics] version=0.6.3
[2023/09/18 02:55:15] [ info] [output:stdout:stdout.0] worker #0 started
[2023/09/18 02:55:15] [ info] [ctraces ] version=0.3.1
[2023/09/18 02:55:15] [ info] [input:cpu:cpu.0] initializing
[2023/09/18 02:55:15] [ info] [input:cpu:cpu.0] storage_strategy='memory' (memory only)
[2023/09/18 02:55:15] [debug] [cpu:cpu.0] created event channels: read=21 write=22
[2023/09/18 02:55:15] [debug] [stdout:stdout.0] created event channels: read=23 write=24
[2023/09/18 02:55:15] [debug] [es:es.1] created event channels: read=30 write=31
[2023/09/18 02:55:16] [debug] [output:es:es.1] extracted cloud_host: '<redacted>'
[2023/09/18 02:55:16] [debug] [output:es:es.1] cloud_host: '<redacted>' does not contain a port: '<redacted>'
[2023/09/18 02:55:16] [ info] [output:es:es.1] worker #1 started
[2023/09/18 02:55:16] [ info] [output:es:es.1] worker #0 started
[2023/09/18 02:55:16] [debug] [output:es:es.1] checked whether extracted port was null and set it to default https port or not. Outcome: '443' and cloud_host: '<redacted>'.
[2023/09/18 02:55:16] [debug] [output:es:es.1] host=<redacted> port=443 uri=/_bulk index=fluent-bit type=_doc
[2023/09/18 02:55:16] [debug] [router] match rule cpu.0:stdout.0
[2023/09/18 02:55:16] [debug] [router] match rule cpu.0:es.1
[2023/09/18 02:55:16] [ info] [sp] stream processor started
[2023/09/18 02:55:17] [debug] [input chunk] update output instances with new chunk size diff=207, records=1, input=cpu.0
^C[2023/09/18 02:55:17] [engine] caught signal (SIGINT)
[2023/09/18 02:55:17] [debug] [task] created task=0x5322eb0 id=0 OK
[0] cpu.0: [[1695005716.948405812, {}], {"cpu_p"=>34.000000, "user_p"=>33.000000, "system_p"=>1.000000, "cpu0.p_cpu"=>7.000000, "cpu0.p_user"=>6.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>62.000000, "cpu1.p_user"=>61.000000, "cpu1.p_system"=>1.000000}]
[2023/09/18 02:55:17] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2023/09/18 02:55:17] [debug] [out flush] cb_destroy coro_id=0
[2023/09/18 02:55:17] [debug] [output:es:es.1] task_id=0 assigned to thread #0
[2023/09/18 02:55:17] [ warn] [engine] service will shutdown in max 5 seconds
[2023/09/18 02:55:17] [ info] [input] pausing cpu.0
[2023/09/18 02:55:18] [ info] [task] cpu/cpu.0 has 1 pending task(s):
[2023/09/18 02:55:18] [ info] [task] task_id=0 still running on route(s): stdout/stdout.0 es/es.1
[2023/09/18 02:55:18] [ info] [input] pausing cpu.0
[2023/09/18 02:55:18] [debug] [upstream] KA connection #60 to <redacted>:443 is connected
[2023/09/18 02:55:18] [debug] [http_client] not using http_proxy for header
[2023/09/18 02:55:18] [debug] [output:es:es.1] using elastic cloud apikey
[2023/09/18 02:55:18] [debug] [output:es:es.1] HTTP Status=200 URI=/_bulk
[2023/09/18 02:55:18] [debug] [output:es:es.1] Elasticsearch response
{"took":31,"errors":false,"items":[{"create":{"_index":"fluent-bit","_id":"fko2pooBqbJxt1RguQ8l","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":31,"_primary_term":1,"status":201}}]}
[2023/09/18 02:55:18] [debug] [upstream] KA connection #60 to <redacted>:443 is now available
[2023/09/18 02:55:18] [debug] [task] destroy task=0x5322eb0 (task_id=0)
[2023/09/18 02:55:18] [debug] [out flush] cb_destroy coro_id=0
[2023/09/18 02:55:18] [ info] [input] pausing cpu.0
[2023/09/18 02:55:19] [ info] [engine] service has stopped (0 pending tasks)
[2023/09/18 02:55:19] [ info] [input] pausing cpu.0
[2023/09/18 02:55:19] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/09/18 02:55:19] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #0 stopping...
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #0 stopped
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #1 stopping...
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #1 stopped
==70111==
==70111== HEAP SUMMARY:
==70111== in use at exit: 0 bytes in 0 blocks
==70111== total heap usage: 18,885 allocs, 18,885 frees, 2,765,200 bytes allocated
==70111==
==70111== All heap blocks were freed -- no leaks are possible
==70111==
==70111== For lists of detected and suppressed errors, rerun with: -s
==70111== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
```
@patrick-stephens could you assist with re-running the integration tests? I'm not quite sure why the integration runs failed the first time round. I've rebased on master.
Do you mean unit tests? Integration tests are not run unless this is labelled. macOS unit tests are flaky at the moment, I believe, so they can be ignored as long as Linux passes.
Ah, that's what I meant. I noticed the failing macOS test and wasn't sure if that was the blocker for the PR.
What would be the next steps to move this PR forward?
It's on the codeowners to review, so it will be in the queue.
Can we extend this feature?
- The header name should be dynamic. There are many cases where other headers are used, for example a Bearer header instead of the ApiKey header.
- The header value should be taken dynamically from a file instead of being a static value. The file can be updated dynamically when a value/token is updated/refreshed.
> The header name should be dynamic. There are many cases where other headers are used, for example a Bearer header instead of the ApiKey header.
Could you elaborate on a use case where the Bearer header is used in the context of Elasticsearch? This change in particular is to support integration with Elastic Cloud via API Keys (see https://www.elastic.co/guide/en/cloud/current/ec-api-authentication.html).

Regardless, I'm not quite sure that allowing users to specify arbitrary authorization headers is ideal, especially if the set of allowable authorization types for the plugin can be well defined.
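To illustrate what a well-defined set could look like, here is a hypothetical sketch of dispatching between the existing Basic auth and the new ApiKey scheme. The context fields (`cloud_apikey`, `cloud_user`, `cloud_passwd`) are assumptions modelled on the existing `cloud_auth` handling, and `es_set_cloud_apikey` is the hypothetical helper from the sketch above:

```c
/* Hypothetical sketch: keep the set of auth schemes closed instead of
 * accepting arbitrary headers. The context field names are assumptions
 * modelled on the existing cloud_auth handling in out_es. */
static int es_set_auth_header(struct flb_http_client *c,
                              struct flb_elasticsearch *ctx)
{
    if (ctx->cloud_apikey) {
        /* Authorization: ApiKey <key> */
        return es_set_cloud_apikey(c, ctx->cloud_apikey);
    }
    else if (ctx->cloud_user && ctx->cloud_passwd) {
        /* Authorization: Basic base64(user:passwd) */
        return flb_http_basic_auth(c, ctx->cloud_user, ctx->cloud_passwd);
    }

    return 0;  /* no auth configured */
}
```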
> The header value should be taken dynamically from a file instead of being a static value. The file can be updated dynamically when a value/token is updated/refreshed.
Looking at other Fluent Bit output plugins, this does not appear to be a common pattern. (The only exception seems to be the Google Cloud credentials JSON, which seems to contain quite a bit of auth information and would probably not be the norm.) I would be hesitant to make this change in this PR without the maintainers' input, since this feels like a config design change that would also be applicable to other plugins.
Personally, I'd be of the opinion to keep things simple in a PR: land one feature before adding more.
Hi. On the other hand, Elasticsearch supports a JWT token as a bearer authorization header, and probably other methods too. So why not universally support any HTTP header set by the user, either as an env variable or as a file containing the header (for security reasons)?
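For concreteness, the file-based variant being proposed could look roughly like the sketch below; `read_token_file` is a purely hypothetical helper, and nothing like it exists in the plugin today:

```c
/* Purely illustrative sketch of the proposal: re-read a token from a
 * file on every flush so that a rotated/refreshed token is picked up.
 * read_token_file is a hypothetical helper, not part of the plugin. */
#include <stdio.h>
#include <string.h>

static int read_token_file(const char *path, char *buf, size_t size)
{
    FILE *fp;
    size_t len;

    fp = fopen(path, "r");
    if (!fp) {
        return -1;
    }

    len = fread(buf, 1, size - 1, fp);
    fclose(fp);
    buf[len] = '\0';

    /* strip a trailing newline, which is common in token files */
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
    }

    return 0;
}
```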
We do a similar thing with Prometheus sending metrics to a remote store. See the authorization section here: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
Hi, any updates on merging this? This feature would be of great help ❤️
That's a shame; I won't be able to use Fluent Bit because it does not support sending logs using Elastic API keys.
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.
rebased
@patrick-stephens @edsiper @PettitWesley could you remove the stale label?
Adding a comment to move it out of the stale state.
What are the current blockers? This is a really long-awaited feature. Thank you @soedar for making this.