Exposes average batch metrics over 1-, 5- and 15-minute time windows.

Open andsel opened this issue 3 weeks ago • 5 comments

Release notes

Exposes batch size metrics for the last 1, 5 and 15 minutes.

What does this PR do?

Updates the stats API response to also expose 1m, 5m and 15m average batch metrics.

Changes the response map returned by the refine_batch_metrics method, which serves API queries to _node/stats, so that it contains the average values over the last 1, 5 and 15 minutes for event_count and batch_size. This data is published once it is available from the metric collector.

Why is it important/What is the impact to the user?

This feature lets Logstash users monitor average batch values over several recent time windows.

Checklist

  • [x] My code follows the style guidelines of this project
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [x] I have made corresponding changes to the documentation
  • ~~[ ] I have made corresponding change to the default configuration files (and/or docker env variables)~~
  • ~~[ ] I have added tests that prove my fix is effective or that my feature works~~ This feature relies on ExtendedFlowMetric, whose time window management is already extensively tested. A test at the API level would have to generate load for at least the duration of the time window and then check the API response. Tests that run for minutes are not feasible.
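
Why a windowed value only becomes available after its window has elapsed can be illustrated with a toy model. The Python sketch below is not Logstash's ExtendedFlowMetric: the class `WindowedAverage` and its methods are hypothetical, and it assumes the average is the change in total events divided by the change in batch count between the newest capture and the most recent capture that is at least one window old. Timestamps are passed in explicitly so the example runs instantly.

```python
from bisect import bisect_right


class WindowedAverage:
    """Toy model (hypothetical): average events per batch over a trailing window."""

    def __init__(self, window_s):
        self.window_s = window_s
        self._times = []     # capture timestamps in seconds, ascending
        self._captures = []  # (total_events, total_batches) at each timestamp

    def capture(self, now_s, total_events, total_batches):
        """Record the cumulative counters observed at time `now_s`."""
        self._times.append(now_s)
        self._captures.append((total_events, total_batches))

    def value(self):
        """Return the windowed average, or None until a full window of history exists."""
        if not self._times:
            return None
        head_t = self._times[-1]
        # Newest capture that is at least one window older than the head capture.
        idx = bisect_right(self._times, head_t - self.window_s) - 1
        if idx < 0:
            return None  # not enough history yet
        base_events, base_batches = self._captures[idx]
        head_events, head_batches = self._captures[-1]
        delta_batches = head_batches - base_batches
        if delta_batches == 0:
            return None  # no batches completed inside the window
        return (head_events - base_events) / delta_batches
```

With a 60-second window, `value()` returns `None` until a capture at least 60 seconds older than the newest one exists, which is consistent with the note above that the data is published once it is available from the metric collector.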

Author's Checklist

  • [ ]

How to test this PR locally

Use the same test harness proposed in #18000, switch pipeline.batch.metrics.sampling_mode to full, and monitor the result of _node/stats for 1, 5, and 15 minutes with:

curl http://localhost:9600/_node/stats | jq .pipelines.main.batch
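
To watch the windowed values appear over time, the one-off curl command above can be wrapped in a small poller. This is a minimal Python sketch and not part of the PR: the helper names `fetch_stats`, `extract_batch` and `poll_batch` are hypothetical, the URL and the `pipelines.main.batch` path are taken from the command above, and nothing is assumed about the shape of the returned subtree.

```python
import json
import time
import urllib.request

STATS_URL = "http://localhost:9600/_node/stats"  # same endpoint as the curl command


def fetch_stats(url=STATS_URL):
    """Fetch and decode the node stats JSON document."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def extract_batch(stats, pipeline="main"):
    """Return the `pipelines.<pipeline>.batch` subtree, or None if it is absent."""
    node = stats
    for key in ("pipelines", pipeline, "batch"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def poll_batch(fetch=fetch_stats, interval_s=60, rounds=16, sleep=time.sleep):
    """Print the batch subtree `rounds` times, pausing `interval_s` seconds in between."""
    snapshots = []
    for i in range(rounds):
        batch = extract_batch(fetch())
        snapshots.append(batch)
        print(json.dumps(batch, indent=2))
        if i < rounds - 1:
            sleep(interval_s)
    return snapshots
```

Calling `poll_batch()` with the defaults takes 16 snapshots one minute apart, which spans the 15-minute window; `fetch` and `sleep` are parameters only so the loop can be exercised without a running Logstash instance.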

Related issues

  • Closes #17998

Use cases

Screenshots

Logs

andsel avatar Dec 04 '25 13:12 andsel

:robot: GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
  • /run exhaustive tests : Run the exhaustive tests Buildkite pipeline.

github-actions[bot] avatar Dec 04 '25 13:12 github-actions[bot]

This pull request does not have a backport label. Could you fix it @andsel? 🙏 To fixup this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

mergify[bot] avatar Dec 04 '25 13:12 mergify[bot]

run exhaustive test

andsel avatar Dec 04 '25 16:12 andsel

That would be the right point, but verifying even the 1-minute average flow metric would require the test to sleep and wait for 1 minute so that the value shows up in the response. Waiting 5 or 15 minutes to execute a test would be a great waste. WDYT?

andsel avatar Dec 09 '25 15:12 andsel

:green_heart: Build Succeeded

History

  • :green_heart: Build #3931 succeeded aa029b9d1593fd4143948dcd5e5561b16b89ceeb
  • :yellow_heart: Build #3917 was flaky 8f2702c130e9108451810f9e1b2deb53857e9148
  • :broken_heart: Build #3912 failed c22c3849ddba36514c666f1c4e928ae14737be9f
  • :yellow_heart: Build #3911 was flaky e083753f5f5475117c0464b7843a5954323edb9c

cc @andsel

elasticmachine avatar Dec 11 '25 09:12 elasticmachine