out_azure_blob: add log_key option
This PR is based on PR #3668 but addresses Azure Blob Storage. The azure_blob plugin was modified to accept a 'log_key' option. By default the entire log record is sent to storage. When the 'log_key' option is specified in the output plugin configuration, only the value of that key is sent to the storage blob.
Addresses #9721
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
- [x] Example configuration file for the change
- [x] Debug log output from testing the change
- [x] Attached Valgrind output that shows no leaks or memory corruption was found
Documentation
- [x] Documentation required for this feature
Doc PR https://github.com/fluent/fluent-bit-docs/pull/1540
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
By default the entire record is sent to Azure Blob Storage. Here is a sample configuration and the default output.
Configuration
[SERVICE]
flush 1
log_level info
[INPUT]
name dummy
dummy {"name": "Fluent Bit", "year": 2020}
samples 1
tag var.log.containers.app-default-96cbdef2340.log
[OUTPUT]
name azure_blob
match *
account_name twilk123
shared_key <snip>
path kubernetes
container_name test-container
auto_create_container on
tls on
Record without log_key
{"@timestamp":"2025-01-02T16:56:02.906357Z","name":"Fluent Bit","year":2020}
If 'log_key' is specified, only the value of that key is sent to Azure Blob Storage.
Sample configuration with log_key
[SERVICE]
flush 1
log_level info
[INPUT]
name dummy
dummy {"name": "Fluent Bit", "year": 2020}
samples 1
tag var.log.containers.app-default-96cbdef2340.log
[OUTPUT]
name azure_blob
match *
account_name twilk123
shared_key <snip>
path kubernetes
container_name test-container
auto_create_container on
tls on
log_key name
Record with log_key set to name
Fluent Bit
Example Valgrind output
root@fluent-bit:/tmp# valgrind ./fluent-bit -c azure.conf
==3022== Memcheck, a memory error detector
==3022== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==3022== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==3022== Command: ./fluent-bit -c azure.conf
==3022==
Fluent Bit v3.2.3
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ _____ _____
| ___| | | | | ___ (_) | |____ |/ __ \
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`' / /'
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ / /
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /./ /___
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)_____/
[2025/01/02 19:56:50] [ info] [fluent bit] version=3.2.3, commit=addf261e8c, pid=3022
[2025/01/02 19:56:50] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/01/02 19:56:50] [ info] [simd ] disabled
[2025/01/02 19:56:50] [ info] [cmetrics] version=0.9.9
[2025/01/02 19:56:50] [ info] [ctraces ] version=0.5.7
[2025/01/02 19:56:51] [ info] [output:azure_blob:azure_blob.0] initializing worker
[2025/01/02 19:56:50] [ info] [input:dummy:dummy.0] initializing
[2025/01/02 19:56:51] [ info] [output:azure_blob:azure_blob.0] worker #0 started
[2025/01/02 19:56:50] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2025/01/02 19:56:51] [ info] [output:azure_blob:azure_blob.0] account_name=twilk123, container_name=test-container, blob_type=appendblob, emulator_mode=no, endpoint=twilk123.blob.core.windows.net, auth_type=key
[2025/01/02 19:56:51] [ info] [sp] stream processor started
[2025/01/02 19:56:54] [ info] [output:azure_blob:azure_blob.0] container 'test-container' already exists
[2025/01/02 19:56:54] [ info] [output:azure_blob:azure_blob.0] content uploaded successfully:
[2025/01/02 19:56:54] [ info] [output:azure_blob:azure_blob.0] blob id (null) committed successfully
^C[2025/01/02 19:57:03] [engine] caught signal (SIGINT)
[2025/01/02 19:57:03] [ warn] [engine] service will shutdown in max 5 seconds
[2025/01/02 19:57:03] [ info] [input] pausing dummy.0
[2025/01/02 19:57:03] [ info] [engine] service has stopped (0 pending tasks)
[2025/01/02 19:57:03] [ info] [input] pausing dummy.0
[2025/01/02 19:57:03] [ info] [output:azure_blob:azure_blob.0] thread worker #0 stopping...
[2025/01/02 19:57:03] [ info] [output:azure_blob:azure_blob.0] initializing worker
[2025/01/02 19:57:03] [ info] [output:azure_blob:azure_blob.0] thread worker #0 stopped
==3022==
==3022== HEAP SUMMARY:
==3022== in use at exit: 0 bytes in 0 blocks
==3022== total heap usage: 17,894 allocs, 17,894 frees, 2,471,158 bytes allocated
==3022==
==3022== All heap blocks were freed -- no leaks are possible
==3022==
==3022== For lists of detected and suppressed errors, rerun with: -s
==3022== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Summary by CodeRabbit
- New Features
- Added log_key configuration option for Azure Blob Storage output plugin that allows extracting a specific field from incoming log records; when configured, only the value of the designated key will be sent to Azure Blob Storage.
@edsiper Can you please give us an update?
Memory leak test after rewrite:
$ valgrind build/bin/fluent-bit -c fluentbit.cfg
==225827== Memcheck, a memory error detector
==225827== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==225827== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==225827== Command: build/bin/fluent-bit -c fluentbit.cfg
==225827==
Fluent Bit v4.0.3
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ ___ _____
| ___| | | | | ___ (_) | / || _ |
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __/ /| || |/' |
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| || /| |
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /\___ |\ |_/ /
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ |_(_)___/
[2025/06/11 14:22:02] [ info] [fluent bit] version=4.0.3, commit=97285bdd2a, pid=225827
[2025/06/11 14:22:03] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/06/11 14:22:03] [ info] [simd ] disabled
[2025/06/11 14:22:03] [ info] [cmetrics] version=1.0.2
[2025/06/11 14:22:03] [ info] [ctraces ] version=0.6.6
[2025/06/11 14:22:03] [ info] [input:dummy:dummy.0] initializing
[2025/06/11 14:22:03] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2025/06/11 14:22:03] [ info] [output:azure_blob:azure_blob.0] account_name=devstoreaccount1, container_name=logs, blob_type=appendblob, emulator_mode=yes, endpoint=http://127.0.0.1:10000, auth_type=key
[2025/06/11 14:22:03] [ info] [sp] stream processor started
[2025/06/11 14:22:03] [ info] [output:azure_blob:azure_blob.0] initializing worker
[2025/06/11 14:22:03] [ info] [output:azure_blob:azure_blob.0] worker #0 started
[2025/06/11 14:22:05] [ info] [output:azure_blob:azure_blob.0] container 'logs' already exists
[2025/06/11 14:22:05] [ info] [output:azure_blob:azure_blob.0] content uploaded successfully:
[2025/06/11 14:22:05] [ info] [output:azure_blob:azure_blob.0] blob id (null) committed successfully
^C[2025/06/11 14:22:18] [engine] caught signal (SIGINT)
[2025/06/11 14:22:18] [ warn] [engine] service will shutdown in max 5 seconds
[2025/06/11 14:22:18] [ info] [input] pausing dummy.0
[2025/06/11 14:22:18] [ info] [engine] service has stopped (0 pending tasks)
[2025/06/11 14:22:18] [ info] [input] pausing dummy.0
[2025/06/11 14:22:18] [ info] [output:azure_blob:azure_blob.0] thread worker #0 stopping...
[2025/06/11 14:22:18] [ info] [output:azure_blob:azure_blob.0] initializing worker
==225827==
==225827== HEAP SUMMARY:
==225827== in use at exit: 0 bytes in 0 blocks
==225827== total heap usage: 7,292 allocs, 7,292 frees, 1,413,601 bytes allocated
==225827==
==225827== All heap blocks were freed -- no leaks are possible
==225827==
==225827== For lists of detected and suppressed errors, rerun with: -s
==225827== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Hi @tomekwilk @lockewritesdocs, Just checking in, are there any blockers preventing this PR from being merged? Let me know if there's anything I can do to help move it forward.
Walkthrough
The Azure Blob output plugin has been enhanced with a log_key feature that enables extraction of a specific field from incoming msgpack data. The azure_blob_format function signature has been expanded to accept additional metadata and context parameters, complemented by a new helper function that handles field extraction and type conversion when log_key is configured.
Changes
| Cohort / File(s) | Summary |
|---|---|
| Log key extraction and formatting logic plugins/out_azure_blob/azure_blob.c | Added static helper function cb_azb_msgpack_extract_log_key to locate and extract a specific field from msgpack data via record accessor, with support for string, float, and int types. Updated azure_blob_format signature to accept additional parameters (flush context, event type, tag, data payload) and return formatted output via pointer parameters. Function now conditionally routes through log key extraction when configured. Added new header includes: flb_record_accessor.h and flb_ra_key.h. |
| Configuration support plugins/out_azure_blob/azure_blob.c | Added log_key configuration entry to the public config map for struct flb_azure_blob, exposing the field as a string configuration option. |
| Data structure plugins/out_azure_blob/azure_blob.h | Added flb_sds_t log_key field to struct flb_azure_blob for storing the configured log key. |
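The string/float/int conversion mentioned in the table can be sketched in isolation. This is only a rough illustration under stated assumptions, not the plugin's code: `field_value`, `field_type`, and `format_field` are hypothetical names standing in for the msgpack object and the record accessor lookup, neither of which is reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a msgpack value (assumption: the real
 * plugin works on msgpack objects located via the record accessor). */
enum field_type { FIELD_STR, FIELD_INT, FIELD_FLOAT, FIELD_OTHER };

struct field_value {
    enum field_type type;
    union {
        const char *str;
        int64_t i64;
        double f64;
    } as;
};

/* Render a field into out as a null-terminated string.
 * Returns the number of bytes written (excluding the terminator),
 * or -1 for unsupported types or when the buffer is too small. */
int format_field(const struct field_value *v, char *out, size_t size)
{
    int n;
    size_t len;

    assert(v != NULL && out != NULL);

    switch (v->type) {
    case FIELD_STR:
        len = strlen(v->as.str);
        if (len >= size) {
            return -1;
        }
        memcpy(out, v->as.str, len + 1);   /* copies the terminator too */
        return (int) len;
    case FIELD_INT:
        n = snprintf(out, size, "%lld", (long long) v->as.i64);
        break;
    case FIELD_FLOAT:
        n = snprintf(out, size, "%f", v->as.f64);
        break;
    default:
        return -1;   /* maps, arrays, etc. are rejected in this sketch */
    }

    if (n < 0 || (size_t) n >= size) {
        return -1;   /* encoding error or truncated output */
    }
    return n;
}
```

With the dummy record from the description, a `FIELD_STR` value of "Fluent Bit" would be copied through unchanged, and a `FIELD_INT` value of 2020 would be rendered as the text "2020".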
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
- Record accessor API usage: Verify correct usage of record accessor to locate and extract fields from msgpack data
- Type conversion logic: Review string/float/int conversion paths and null-termination handling in the helper function
- Function signature impact: Trace how the expanded azure_blob_format signature integrates with plugin callback mechanisms and any callers
- Error handling paths: Ensure robust cleanup and error reporting for missing fields, unsupported types, and allocation failures
Suggested reviewers
- leonardo-albertovich
- koleini
- fujimotos
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a log_key option to the azure_blob output plugin. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
I rebased the PR to resolve the merge conflicts after recent master changes. This PR is waiting to be re-reviewed and merged. Not sure if there is anything else for me to do.
Hello @edsiper , @adrinaula ,
This PR tackles an issue that we've also recently faced. Any idea if there is anything preventing or blocking the merge?
I would be interested in contributing if need be :)
Thanks in advance,
@tomekwilk Eduardo requested a change, can you take a look at fixing?
> @tomekwilk Eduardo requested a change, can you take a look at fixing?
Which change are we talking about? This one: flb_errno() needs to be called before flb_plg_error()?
If we can help in any way don't hesitate, we have the exact same requirement but we don't want to create a new PR that does exactly what @tomekwilk did...
I fixed one place where flb_errno() was after flb_plg_error() and rebased the PR. Not sure what else can be blocking this PR. I requested re-review after addressing the initial comments but heard nothing back.
If anyone would like to help push this PR forward or verify the change feel free, it would be appreciated. I am currently traveling and have limited access. Thanks!
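For anyone following along, the ordering presumably matters because a logging call can itself run library code that overwrites errno before it is reported. Here is a minimal sketch of that general hazard in plain C. It assumes nothing about Fluent Bit internals: `noisy_log` and `failing_op` are hypothetical helpers, and `noisy_log` deliberately resets errno so the effect is deterministic.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical logger. Real loggers may call into libc, which is
 * allowed to modify errno; this one forces the change so the hazard
 * shows up every time. */
void noisy_log(const char *msg)
{
    fprintf(stderr, "[error] %s\n", msg);
    errno = 0;   /* simulate errno being clobbered during logging */
}

/* Hypothetical failing operation that sets errno. */
int failing_op(void)
{
    errno = ENOENT;
    return -1;
}

/* Wrong order: errno is read only after the logger has run. */
int errno_after_log(void)
{
    if (failing_op() == -1) {
        noisy_log("operation failed");
        return errno;          /* the original error code is gone */
    }
    return 0;
}

/* Right order: capture errno first, then log. */
int errno_before_log(void)
{
    if (failing_op() == -1) {
        int saved = errno;     /* read errno before anything else runs */
        noisy_log("operation failed");
        return saved;          /* still the code set by failing_op() */
    }
    return 0;
}
```

In this sketch `errno_before_log()` returns ENOENT while `errno_after_log()` returns 0, which is the same reason the errno report belongs before the plugin error message.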
@eschabell @edsiper what's missing to validate this PR?
Hello, when will this fix be released?
> @eschabell @edsiper what's missing to validate this PR?
Hey @overmeulen looks like it's waiting on user changes requested by reviewer?
Removed the log_key cleanup from flb_azure_blob_conf_destroy(). That cleanup had been suggested by coderabbitai and was causing a double-free error. log_key is part of the config map and is freed when the plugin instance is destroyed.
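The ownership rule behind this fix can be illustrated with a small standalone sketch. Everything in it is hypothetical (`demo_ctx`, `demo_init`, `demo_conf_destroy`, `framework_destroy`, and the counters) and it only models the idea that a value owned by the framework must not also be freed by the plugin's own cleanup; it is not how Fluent Bit implements its config map.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: log_key is filled in by the framework from the
 * configuration, uri is allocated by the plugin itself. */
struct demo_ctx {
    char *log_key;   /* owned by the framework */
    char *uri;       /* owned by the plugin */
};

/* Counters used only to make the number of releases observable. */
int plugin_frees = 0;
int framework_frees = 0;

/* Portable string duplication (strdup is not part of ISO C before C23). */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

int demo_init(struct demo_ctx *ctx, const char *log_key)
{
    ctx->log_key = dup_string(log_key);
    ctx->uri = dup_string("/hypothetical/blob/uri");
    if (ctx->log_key == NULL || ctx->uri == NULL) {
        free(ctx->log_key);
        free(ctx->uri);
        ctx->log_key = NULL;
        ctx->uri = NULL;
        return -1;
    }
    return 0;
}

/* Plugin-side cleanup: release only what the plugin allocated.
 * Freeing ctx->log_key here without clearing it would lead
 * framework_destroy() to free the same pointer a second time. */
void demo_conf_destroy(struct demo_ctx *ctx)
{
    if (ctx->uri != NULL) {
        free(ctx->uri);
        ctx->uri = NULL;
        plugin_frees++;
    }
}

/* Framework-side cleanup: runs the plugin cleanup, then releases the
 * configuration-owned values exactly once. */
void framework_destroy(struct demo_ctx *ctx)
{
    demo_conf_destroy(ctx);
    if (ctx->log_key != NULL) {
        free(ctx->log_key);
        ctx->log_key = NULL;
        framework_frees++;
    }
}
```

Each pointer has exactly one owner responsible for freeing it, which is why dropping the extra cleanup removes the double free without introducing a leak.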
Here is the updated Valgrind output. I believe that the error below is not related to this PR.
~/fluent-bit/build (dev-vm-461200)$ valgrind bin/fluent-bit -c fluentbit.conf
==22945== Memcheck, a memory error detector
==22945== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==22945== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==22945== Command: bin/fluent-bit -c fluentbit.conf
==22945==
Fluent Bit v4.2.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io
______ _ _ ______ _ _ ___ _____
| ___| | | | | ___ (_) | / | / __ \
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __/ /| | `' / /'
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| | / /
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /\___ |_./ /___
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ |_(_)_____/
Fluent Bit v4.2 – Direct Routes Ahead
Celebrating 10 Years of Open, Fluent Innovation!
[2025/12/03 17:21:41.239025245] [ info] [fluent bit] version=4.2.1, commit=d9749a9eff, pid=22945
[2025/12/03 17:21:41.298763381] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/12/03 17:21:41.299828308] [ info] [simd ] disabled
[2025/12/03 17:21:41.512516957] [ info] [output:azure_blob:azure_blob.0] initializing worker
[2025/12/03 17:21:41.300483317] [ info] [cmetrics] version=1.0.5
[2025/12/03 17:21:41.515441205] [ info] [output:azure_blob:azure_blob.0] worker #0 started
[2025/12/03 17:21:41.301077789] [ info] [ctraces ] version=0.6.6
[2025/12/03 17:21:41.332743693] [ info] [input:dummy:dummy.0] initializing
[2025/12/03 17:21:41.334188627] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2025/12/03 17:21:41.421599808] [ info] [output:azure_blob:azure_blob.0] account_name=devstoreaccount1, container_name=test-container, blob_type=appendblob, emulator_mode=yes, endpoint=http://127.0.0.1:10000, auth_type=key
[2025/12/03 17:21:41.484604743] [ info] [sp] stream processor started
[2025/12/03 17:21:41.489661070] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
==22945== Warning: client switching stacks? SP change: 0x7b42548 --> 0x6062190
==22945== to suppress, use: --max-stackframe=28181432 or greater
==22945== Warning: client switching stacks? SP change: 0x6062078 --> 0x7b42548
==22945== to suppress, use: --max-stackframe=28181712 or greater
==22945== Warning: client switching stacks? SP change: 0x7b42548 --> 0x6062078
==22945== to suppress, use: --max-stackframe=28181712 or greater
==22945== further instances of this message will not be shown.
[2025/12/03 17:21:43.625144119] [ info] [output:azure_blob:azure_blob.0] container 'test-container' already exists
==22945== Thread 4 flb-out-azure_bl:
==22945== Conditional jump or move depends on uninitialised value(s)
==22945== at 0x484F229: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==22945== by 0x531FDA7: __printf_buffer (vfprintf-process-arg.c:435)
==22945== by 0x5344D90: __vsnprintf_internal (vsnprintf.c:96)
==22945== by 0x5344D90: vsnprintf (vsnprintf.c:103)
==22945== by 0x25F9B9: flb_sds_printf (flb_sds.c:357)
==22945== by 0x7BE148: azb_block_blob_uri_commit (azure_blob_blockblob.c:133)
==22945== by 0x7BEC42: azb_block_blob_commit_block (azure_blob_blockblob.c:341)
==22945== by 0x798E06: send_blob (azure_blob.c:637)
==22945== by 0x79F0B0: cb_azure_blob_flush (azure_blob.c:1753)
==22945== by 0x2969DF: output_pre_cb_flush (flb_output.h:706)
==22945== by 0x167C06A: co_init (amd64.c:117)
==22945==
[2025/12/03 17:21:43.686486905] [ info] [output:azure_blob:azure_blob.0] content uploaded successfully:
[2025/12/03 17:21:43.706190168] [ info] [output:azure_blob:azure_blob.0] blob id (null) committed successfully
^C[2025/12/03 17:22:10] [engine] caught signal (SIGINT)
[2025/12/03 17:22:10.573687209] [ warn] [engine] service will shutdown in max 5 seconds
[2025/12/03 17:22:10.574904549] [ info] [engine] pausing all inputs..
[2025/12/03 17:22:10.576506925] [ info] [input] pausing dummy.0
[2025/12/03 17:22:10.889106628] [ info] [engine] service has stopped (0 pending tasks)
[2025/12/03 17:22:10.889598275] [ info] [input] pausing dummy.0
[2025/12/03 17:22:10.894383437] [ info] [output:azure_blob:azure_blob.0] thread worker #0 stopping...
[2025/12/03 17:22:10.896913640] [ info] [output:azure_blob:azure_blob.0] initializing worker
==22945==
==22945== HEAP SUMMARY:
==22945== in use at exit: 0 bytes in 0 blocks
==22945== total heap usage: 8,901 allocs, 8,901 frees, 1,943,861 bytes allocated
==22945==
==22945== All heap blocks were freed -- no leaks are possible
==22945==
==22945== Use --track-origins=yes to see where uninitialised values come from
==22945== For lists of detected and suppressed errors, rerun with: -s
==22945== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
@eschabell I believe that all @edsiper comments were addressed. What am I missing?
@edsiper @cosmo0920 @lecaros can one of you review this for @tomekwilk? He's doing his part and is waiting on feedback.