fluent-bit icon indicating copy to clipboard operation
fluent-bit copied to clipboard

bin: run additional validation for --dry-run

Open ahayworth opened this issue 1 month ago • 7 comments

There are certain kinds of invalid configurations that pass the --dry-run check, but fail to start at run time. This commit addresses the issue by re-using some of the hot reload validation logic.

For example, this config is trivially invalid, passes --dry-run, and fails at runtime:

pipeline:
  inputs:
  - name: dummy
    tag: test
    invalid_property_that_does_not_exist: some_value
  outputs:
  - name: stdout
    match: '*'
zsh ❮ fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

configuration test is successful

zsh ❮ fluent-bit -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:05:55.268532000] [ info] [fluent bit] version=4.1.1, commit=, pid=21037
[2025/10/31 16:05:55.268808000] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/10/31 16:05:55.269140000] [ info] [simd    ] disabled
[2025/10/31 16:05:55.269147000] [ info] [cmetrics] version=1.0.5
[2025/10/31 16:05:55.269407000] [ info] [ctraces ] version=0.6.6
[2025/10/31 16:05:55.269476000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:05:55.269486000] [ help] try the command: fluent-bit -i dummy -h

[2025/10/31 16:05:55.269515000] [error] [engine] input initialization failed

With this commit, we can see that the additional validation from hot reload catches the error right away:

zsh ❯ bin/fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:07:50.402568000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:07:50.402792000] [ help] try the command: bin/fluent-bit -i dummy -h

[2025/10/31 16:07:50.402800000] [error] [reload] check properties for input plugins is failed

(The logs of course now say [reload], which is a little misleading... we can clean that up, if desired).


Enter [N/A] in the box, if an item is not applicable to your change.

Testing Before we can approve your change; please submit the following in a comment:

  • [x] Example configuration file for the change (included in the commit/PR description)
  • [x] Debug log output from testing the change (included in the commit/PR description)
  • [ ] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Dry-run now performs explicit configuration validation before exit, always runs cleanup, and returns failure when unknown or invalid config properties are detected, providing clearer failure signals during config testing.
  • Tests

    • Added an automated test that verifies dry-run fails for invalid configuration properties and reports the unknown-property and validation failure messages.

✏️ Tip: You can customize this high-level summary in your review settings.

ahayworth avatar Oct 31 '25 21:10 ahayworth

Walkthrough

In --dry-run mode, fluent-bit now calls flb_reload_property_check_all(config) before cleanup, stores its return value, always performs cleanup, and exits non-zero on validation failure or prints a success message and exits normally on success.

Changes

Cohort / File(s) Change Summary
Dry-run validation logic
src/fluent-bit.c
Call flb_reload_property_check_all(config) in the --dry-run path, capture its return value, always run cleanup (destroy config/options/context), and exit non-zero on validation failure or print success and exit normally on success.
New dry-run invalid-property test
tests/runtime_shell/dry_run_invalid_property.sh
Added a shell test that writes a temp YAML with an invalid property, runs fluent-bit with --dry-run, asserts a non-zero exit code, checks for unknown-property and reload-validation failure messages, and cleans up temporary files.
Test inclusion
tests/runtime_shell/CMakeLists.txt
Added dry_run_invalid_property.sh to UNIT_TESTS_SH so the new test is discovered and executed by the test harness.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant FB as "fluent-bit (--dry-run)"
    participant Validator as "flb_reload_property_check_all"

    User->>FB: Start fluent-bit with --dry-run
    Note over FB: Parse config, init env/options/context
    FB->>Validator: flb_reload_property_check_all(config)
    alt Validator returns non-zero
        Validator-->>FB: non-zero (validation failure)
        Note over FB: Perform cleanup (destroy config/options/context)
        FB-->>User: Exit non-zero (print validation failure)
    else Validator returns zero
        Validator-->>FB: zero (validation success)
        Note over FB: Perform cleanup (destroy config/options/context)
        FB-->>User: Print "configuration test succeeded" and exit zero
    end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Pay attention to:
    • Correct propagation of the validation return value after cleanup.
    • Any side effects or resource lifetime implications of calling flb_reload_property_check_all() before teardown.
    • Robustness of the new shell test (temporary file handling and message matching).

Suggested reviewers

  • edsiper
  • koleini
  • fujimotos

Poem

🐰 I hop through configs, sniff each YAML line,

I flag the odd property, then tidy the mine.
I clean every burrow, leave no stray byte,
If something's awry I thump — if not, all's right. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'bin: run additional validation for --dry-run' directly and clearly describes the main change: adding validation logic to the --dry-run feature to catch invalid configurations earlier.
✨ Finishing touches
  • [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9ef91de4c7380ad22049f430800aa3d8d41cdb57 and 31924ee14fe6eb7b7154dba6f8e02fcda263ad67.

📒 Files selected for processing (3)
  • src/fluent-bit.c (1 hunks)
  • tests/runtime_shell/CMakeLists.txt (1 hunks)
  • tests/runtime_shell/dry_run_invalid_property.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/runtime_shell/CMakeLists.txt
  • tests/runtime_shell/dry_run_invalid_property.sh
  • src/fluent-bit.c

[!TIP]

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions: | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context. Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Oct 31 '25 21:10 coderabbitai[bot]

@codex review

edsiper avatar Nov 01 '25 15:11 edsiper

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Hi, could you rebase off master? After rebasing off the current master, I can confirm like as a below even if on macOS:

% bin/fluent-bit -c  invalid_config.yaml --dry-run
Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __  
|  ___| |                | |   | ___ (_) |           /   | /  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/11/07 15:17:51.675942000] [ info] Configuration:
[2025/11/07 15:17:51.675948000] [ info]  flush time     | 1.000000 seconds
[2025/11/07 15:17:51.675952000] [ info]  grace          | 5 seconds
[2025/11/07 15:17:51.675954000] [ info]  daemon         | 0
[2025/11/07 15:17:51.675955000] [ info] ___________
[2025/11/07 15:17:51.675957000] [ info]  inputs:
[2025/11/07 15:17:51.675959000] [ info]      dummy
[2025/11/07 15:17:51.675960000] [ info] ___________
[2025/11/07 15:17:51.675962000] [ info]  filters:
[2025/11/07 15:17:51.675963000] [ info] ___________
[2025/11/07 15:17:51.675965000] [ info]  outputs:
[2025/11/07 15:17:51.675966000] [ info]      stdout.0
[2025/11/07 15:17:51.675968000] [ info] ___________
[2025/11/07 15:17:51.675969000] [ info]  collectors:
[2025/11/07 15:17:51.675986000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/11/07 15:17:51.675991000] [ help] try the command: bin/fluent-bit -i dummy -h

[2025/11/07 15:17:51.675994000] [error] [reload] check properties for input plugins is failed

cosmo0920 avatar Nov 07 '25 06:11 cosmo0920

@patrick-stephens @cosmo0920 I've rebased and added a test - I'm not sure how extensive we'd like test coverage to be here, but if needed I can add additional cases.

ahayworth avatar Nov 10 '25 15:11 ahayworth

@ahayworth Could you rebase off master? I fixed CI issue by no left device space. So, we need to confirm that all of CI tasks is green.

cosmo0920 avatar Nov 19 '25 09:11 cosmo0920

@cosmo0920 Absolutely, rebased!

ahayworth avatar Nov 20 '25 13:11 ahayworth