beats
beats copied to clipboard
- Bug/Enhancement: Update metricbeat test modules timeout to be configurable
The timeout when running metricbeat test modules <my_module> was intermittently giving an error "ERROR timeout waiting for an event". It seemed to be intermittent since our module was taking between 5 to 6 seconds to respond.
Unfortunately we discovered the timeout was hardcoded to 5 seconds and not configurable. This fix that we tested, will adjust the timeout to be now configurable by setting http timeout, it also appears to default to period if timeout is unset.
Reference Links: https://www.elastic.co/docs/reference/beats/metricbeat/command-line-options#test-command https://www.elastic.co/docs/reference/beats/metricbeat/configuration-metricbeat#_timeout https://www.elastic.co/docs/reference/beats/metricbeat/command-line-options#test-command
Please label this PR with one of the following labels, depending on the scope of your change:
- Bug
- Enhancement
Proposed commit message
Update metricbeat test modules timeout to be configurable
Explain here the changes you made on the PR. Updated the command metricbeat test modules <my_module> timeout value instead of heard-coded to be set to the http timeout value.
Please explain:
-
WHAT: patterns used, algorithms implemented, design architecture, message processing, etc. N/A it is simple fix, one line of code.
-
WHY: the rationale/motivation for the changes We were getting this intermittent error when demoing the testing of modules to client and it made metribeat look odd with random event failures, that were due to hard coded timeout.
See Title
Checklist
- [x ] My code follows the style guidelines of this project
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding change to the default configuration files
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added an entry in
CHANGELOG.next.asciidocorCHANGELOG-developer.next.asciidoc.
Disruptive User Impact
No, it should have no impact.
Author's Checklist
Review single line code change, it was tested and it works as described and is now configurable.
How to test this PR locally
Go into any metricbeat/module.d/my_module.yml that uses http/https and set the yml file timeout at 1s and you should see an error, that we were encountering.
Related issues
None:
Use cases
Command: metricbeat test modules my_module -v -e Test: Prior to code integration, test hardcoded timeout with response time of an http API request with greater than 5 seconds, then build metricbeat based on this one line code change and it should default my_module.yml time period, if timeout is unset, so you can set period to 1s and then it should fail, adjust period to 10s it should work, then set timeout to 1s it should fail, and adjust to 10s and it should work.
Screenshots
N/A
Logs
metricbeat test modules my_module -v -e
{"log.level":"info","@timestamp":"2025-05-18T12:27:29.940-0600","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).configure","file.name":"instance/beat.go","file.line":1062},"message":"Home path: [/Users/danh/github/danh/beats/metricbeat] Config path: [/Users/danh/github/danh/beats/metricbeat] Data path: [/Users/danh/github/danh/beats/metricbeat/data] Logs path: [/Users/danh/github/danh/beats/metricbeat/logs]","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2025-05-18T12:27:29.940-0600","log.origin":{"function":"github.com/elastic/beats/v7/libbeat/cmd/instance.(*Beat).configure","file.name":"instance/beat.go","file.line":1070},"message":"Beat ID: cc12c183-8e7d-40ce-8e7d-5cfd9395df71","service.name":"metricbeat","ecs.version":"1.6.0"}
{"log.level":"warn","@timestamp":"2025-05-18T12:27:30.102-0600","log.logger":"cfgwarn","log.origin":{"function":"github.com/elastic/beats/v7/metricbeat/module/my_module/health.New","file.name":"health/health.go","file.line":65},"message":"BETA: The my_module health metricset is beta.","service.name":"metricbeat","ecs.version":"1.6.0"}
my_module...
health...
error... ERROR timeout waiting for an event
:robot: GitHub comments
Expand to view the GitHub comments
Just comment with:
rundocs-build: Re-trigger the docs validation. (use unformatted text in the comment!)
This pull request does not have a backport label. If this is a bug or security fix, could you label this PR @DanH-Semplicity? 🙏. For such, you'll need to label your PR with:
- The upcoming major version of the Elastic Stack
- The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)
To fixup this pull request, you need to add the backport labels for the needed branches, such as:
backport-8./dis the label to automatically backport to the8./dbranch./dis the digitbackport-active-allis the label that automatically backports to all active branches.backport-active-8is the label that automatically backports to all active minor branches for the 8 major.backport-active-9is the label that automatically backports to all active minor branches for the 9 major.
It would be nice if this was back ported to 8.17, we first encountered the error on 8.17.
@DanH-Semplicity, thanks for the PR!
Indeed it makes the test timeout configurable, however I'm not sure using the HTTP timeout as the test timeout is the best option here. Aside from the double meaning for the timeout configuration key, it also prevents the HTTP timeout and the test timeout to be set at the same time and as different values.
I believe the best option is to introduce a new configuration option, however before changing anything on your PR, let me get more feed back from the team.
Hi @belimawr,
Interesting, I was thinking of testing the http case timeout, where existing time-outs and defaults were set and should be honored. However I think your indicating a more generic module testing scenerio where perhaps it should be passed as parameter to the test command where it can be configured on test modules invocation (and I would suggest defaults to 10 seconds). https://www.elastic.co/docs/reference/beats/metricbeat/command-line-options#test-command. Let me know what your team comes up with. Either sounds fine to me, but I agree your proposed resolution might be better for non http scenario, since I believe timeout only applies to http and not standard config options.
Thanks Dan
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Hi @belimawr,
Interesting, I was thinking of testing the http case timeout, where existing time-outs and defaults were set and should be honored. However I think your indicating a more generic module testing scenerio where perhaps it should be passed as parameter to the test command where it can be configured on test modules invocation (and I would suggest defaults to 10 seconds). https://www.elastic.co/docs/reference/beats/metricbeat/command-line-options#test-command. Let me know what your team comes up with. Either sounds fine to me, but I agree your proposed resolution might be better for non http scenario, since I believe timeout only applies to http and not standard config options.
Thanks Dan
Based on where this timeout is in the code, any module is bound by it, not only the ones that make HTTP requests, hence the more generic approach.
@rdner suggestion, even though is more complex, is very interesting and allows for easily setting the timeout via CLI.
@belimawr and @rdner, I think @rdner purposed solution is a better overall implementation, due to solving a timer being passed as a parm for all modules (instead of http only). Do you want me to generate the change on this PR or open separate PR to fix? Thanks Dan
@belimawr and @rdner, I think @rdner purposed solution is a better overall implementation, due to solving a timer being passed as a parm for all modules (instead of http only). Do you want me to generate the change on this PR or open separate PR to fix? Thanks Dan
On this PR is fine.