beats
[beatreceiver] - Add status reporting
Proposed commit message
This PR adds status reporting for beatreceivers. Status reporters are injected while the runners are created. The first PR (https://github.com/elastic/beats/pull/44528) was quite "hacky": it had to go deep down to inject status reporters.
This PR adds a runner factory wrapper that will:
- Call the parent factory to create the runner
- Inject status reporter
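The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: Runner, RunnerFactory, StatusReporter, StatusAware and WrapFactory are simplified, hypothetical stand-ins for the libbeat types, and the demo runner, factory and recorder exist only to make the sketch runnable.

```go
package main

import "fmt"

// Runner, RunnerFactory and StatusReporter are simplified, hypothetical
// stand-ins; the real libbeat interfaces have different signatures.
type Runner interface {
	Start()
	Stop()
}

type RunnerFactory interface {
	Create(cfg map[string]any) (Runner, error)
}

type StatusReporter interface {
	UpdateStatus(state string, msg string)
}

// StatusAware (hypothetical) is implemented by runners that accept a reporter.
type StatusAware interface {
	SetStatusReporter(r StatusReporter)
}

// statusFactory wraps a parent factory and injects a reporter into
// every runner that supports it.
type statusFactory struct {
	parent   RunnerFactory
	reporter StatusReporter
}

// WrapFactory returns a factory that decorates parent with status injection.
func WrapFactory(parent RunnerFactory, r StatusReporter) RunnerFactory {
	return &statusFactory{parent: parent, reporter: r}
}

func (f *statusFactory) Create(cfg map[string]any) (Runner, error) {
	// Step 1: call the parent factory to create the runner.
	runner, err := f.parent.Create(cfg)
	if err != nil {
		return nil, err
	}
	// Step 2: inject the status reporter if the runner accepts one.
	if sa, ok := runner.(StatusAware); ok {
		sa.SetStatusReporter(f.reporter)
	}
	return runner, nil
}

// demoRunner reports a degraded state when started (illustration only).
type demoRunner struct{ reporter StatusReporter }

func (r *demoRunner) SetStatusReporter(sr StatusReporter) { r.reporter = sr }

func (r *demoRunner) Start() {
	if r.reporter != nil {
		r.reporter.UpdateStatus("degraded", "cannot read from file source")
	}
}

func (r *demoRunner) Stop() {}

// demoFactory is the "parent" factory in this sketch.
type demoFactory struct{}

func (demoFactory) Create(map[string]any) (Runner, error) { return &demoRunner{}, nil }

// recorder collects status updates so they can be inspected.
type recorder struct{ updates []string }

func (r *recorder) UpdateStatus(state, msg string) {
	r.updates = append(r.updates, state+": "+msg)
}

func main() {
	rec := &recorder{}
	factory := WrapFactory(demoFactory{}, rec)
	runner, err := factory.Create(nil)
	if err != nil {
		panic(err)
	}
	runner.Start()
	fmt.Println(rec.updates)
}
```

Because the wrapper only depends on the factory interface, the parent factory and the runners do not need to know whether they are running inside a beatreceiver; runners that do not implement the optional interface are returned unchanged.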
The code responsible for the above tasks will live in libbeat, and we will only enable it for beatreceivers. At a high level, the beat receiver flow will be as follows:
- The beater will be created in createReceiver
- We will add the factory wrapper https://github.com/elastic/beats/blob/344bbcefae3c7be4f6f9a7ff0b1e7985caf0823c/x-pack/libbeat/cmd/instance/receiver.go#L80-L83
- The receiver will kick off the beater https://github.com/elastic/beats/blob/62864922d3227e4586ad0b53c9c0dfb213df3f69/x-pack/libbeat/cmd/instance/receiver.go#L76-L81
Note:
To accomplish the above steps, it is essential that we create the runners in beater.Run(...). Currently, metricbeat creates runners during the beater creation phase and only starts them in beater.Run(...), so they are built before the factory wrapper is installed. This PR moves the runner creation code into beater.Run(...) to closely align with filebeat's implementation.
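The ordering constraint in this note can be illustrated with a small sketch. All names here (Factory, beater, newBeater, wrapFactory, run, withStatus) are hypothetical and model runners as plain strings; the point is only that runners created inside the run step see a wrapper installed after the beater was constructed, whereas runners created in the constructor would not.

```go
package main

import "fmt"

// Factory (hypothetical) builds a "runner", modelled here as a string.
type Factory func(name string) string

// beater (hypothetical) keeps only configuration until run is called.
type beater struct {
	factory Factory
	modules []string
}

// newBeater stores the module list and a default factory.
// It deliberately does not create any runners.
func newBeater(modules []string) *beater {
	return &beater{
		factory: func(name string) string { return "runner(" + name + ")" },
		modules: modules,
	}
}

// wrapFactory models what the receiver does between creation and run:
// it replaces the factory with a wrapped version.
func (b *beater) wrapFactory(wrap func(Factory) Factory) {
	b.factory = wrap(b.factory)
}

// run creates the runners at this point, so they are built by
// whatever factory is installed when run is called.
func (b *beater) run() []string {
	runners := make([]string, 0, len(b.modules))
	for _, m := range b.modules {
		runners = append(runners, b.factory(m))
	}
	return runners
}

// withStatus marks every runner produced by the parent factory.
func withStatus(parent Factory) Factory {
	return func(name string) string { return parent(name) + "+status" }
}

func main() {
	b := newBeater([]string{"system", "benchmark"})
	b.wrapFactory(withStatus) // installed after creation, before run
	fmt.Println(b.run())
}
```

Had newBeater built the runners eagerly, the later wrapFactory call would have had no effect on them, which is the situation this PR avoids for metricbeat.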
Checklist
- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding change to the default configuration files
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Related issues
- Related https://github.com/elastic/elastic-agent/issues/8210
Screenshots
Output
Here's the output of running two streams (both degraded) together:
┌─ fleet
│  └─ status: (STOPPED) Not enrolled into Fleet
└─ elastic-agent
   ├─ status: (DEGRADED) 1 or more components/units in a degraded state
   ├─ pipeline:logs/_agent-component/filestream-default
   │  ├─ status: StatusRecoverableError [error while running harvester: cannot read from file source: /var/log/elasticAgent-install-20240625_133733.log]
   │  ├─ exporter:elasticsearch/_agent-component/default
   │  │  └─ status: StatusOK
   │  └─ receiver:filebeatreceiver/_agent-component/filestream-default
   │     └─ status: StatusRecoverableError [error while running harvester: cannot read from file source: /var/log/elasticAgent-install-20240625_133733.log]
   └─ pipeline:logs/_agent-component/system/metrics-default
      ├─ status: StatusRecoverableError [Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 607 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.]
      ├─ exporter:elasticsearch/_agent-component/default
      │  └─ status: StatusOK
      └─ receiver:metricbeatreceiver/_agent-component/system/metrics-default
         └─ status: StatusRecoverableError [Error fetching data for metricset system.process: error fetching process list: non fatal error; reporting partial metrics: error fetching PID metrics for 607 processes, most likely a "permission denied" error. Enable debug logging to determine the exact cause.]
Testing
- Checkout this PR locally
- Go to elastic-agent and follow this guide to test local beats changes
- Package agent with mage package
- Follow steps on https://github.com/elastic/elastic-agent/issues/8210 to install agent and verify the status
Closes https://github.com/elastic/elastic-agent/issues/8210
This pull request does not have a backport label. If this is a bug or security fix, could you label this PR @VihasMakwana? 🙏. For such, you'll need to label your PR with:
- The upcoming major version of the Elastic Stack
- The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)
To fixup this pull request, you need to add the backport labels for the needed branches, such as:
- backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
- backport-active-all is the label that automatically backports to all active branches.
- backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
- backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.
Quite an elegant solution for this problem!
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
This pull request now has conflicts. Could you fix it? 🙏 To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/
git fetch upstream
git checkout -b wrap-runner-factory upstream/wrap-runner-factory
git merge upstream/main
git push upstream wrap-runner-factory
@VihasMakwana Could you please fix the conflicts? Thank you!
@mauri870 @khushijain21 I've added new test cases and changed the benchmark module for testing. We can now make the benchmark module return an error when we want to test status reporting. Please take a look!