Add machine heartbeat as a Prometheus metric
What would you like to be added?
I think it would be useful for the Prometheus exporter to also export a metric for the heartbeat of the registered machines. For example:
cs_machines_heartbeat_seconds{instance="example.com"} 46
similarly to how the CLI already displays this info.
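To make the request more concrete, here is a rough sketch of what such a gauge could look like in the Prometheus text exposition format. It is only an illustration, not how the exporter is implemented: the function name `render_heartbeat_metric` and the input dictionary are hypothetical, and the value is assumed to be the number of seconds since the last heartbeat.

```python
import time


def render_heartbeat_metric(last_heartbeats, now=None):
    """Render seconds since each machine's last heartbeat as Prometheus text.

    last_heartbeats: dict mapping an instance name to the Unix timestamp of
    its last heartbeat (hypothetical input, e.g. read from the LAPI database).
    """
    if now is None:
        now = time.time()
    lines = [
        "# HELP cs_machines_heartbeat_seconds Seconds since the machine's last heartbeat.",
        "# TYPE cs_machines_heartbeat_seconds gauge",
    ]
    for instance, last_seen in sorted(last_heartbeats.items()):
        # Escape characters that are special inside a label value.
        label = (
            instance.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        # Clamp at zero in case of small clock skew between machines.
        age = max(0, int(now - last_seen))
        lines.append(f'cs_machines_heartbeat_seconds{{instance="{label}"}} {age}')
    return "\n".join(lines) + "\n"
```

An alert could then fire whenever the value for any instance exceeds a chosen threshold.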
/kind enhancement
Why is this needed?
It would help us see whether the infrastructure is still intact when changes happen. For example, we have a setup where machines have to communicate across firewalls. If a firewall change prevented client instances from connecting back to the LAPI, nobody would notice unless they went looking for it.
@LuminatiHD: Thanks for opening an issue, it is currently awaiting triage.
In the meantime, you can:
- Check Crowdsec Documentation to see if your issue can be self resolved.
- You can also join our Discord.
- Check Releases to make sure your agent is on the latest version.
Details
I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository.
@LuminatiHD: There is no 'kind' label on this issue. You need a 'kind' label to start the triage process.
- /kind feature
- /kind enhancement
- /kind bug
- /kind packaging
Hey 👋🏻
Just to ask a few more questions: each machine has a local Prometheus metrics port that Prometheus can scrape. If you set up scraping for all of these instances, wouldn't monitoring already be covered?
However, I do see the point that this does NOT cover the case where the instance itself cannot connect back to the main instance.
I just wanted to get more information since the feature request was 5 words.
No problem. I updated the comment, is this more helpful?
/kind enhancement
Thank you for updating the request with a lot more details.
As stated in your other enhancement request, with the release of v1.6.0 the team has their hands full with other projects.
I have added the "good first issue" tag to indicate pull requests from everyone are welcome to resolve this.
I've just noticed that cs_lapi_machine_requests_total{route="/v1/heartbeat"} already exists. IMO, it's not that nice of a solution, as opposed to a dedicated metric, but I understand if that is enough for closing this issue.
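For anyone wanting to use the existing counter as a workaround in the meantime, the idea can be sketched as follows: compare two scrapes of the heartbeat counter and flag machines whose value did not increase. This is only a sketch. It assumes the counter has a per-machine label, and the function name `stale_machines` and the plain dictionaries standing in for scraped samples are hypothetical.

```python
def stale_machines(previous, current):
    """Return machines whose heartbeat counter did not increase between two scrapes.

    previous, current: dicts mapping a machine name to the scraped value of
    cs_lapi_machine_requests_total{route="/v1/heartbeat"} for that machine
    (assumes a per-machine label exists on the counter).
    """
    stale = []
    for machine, before in previous.items():
        after = current.get(machine)
        # A missing series or an unchanged value means no heartbeat arrived
        # in the window. A lower value suggests a counter reset (for example
        # a LAPI restart) followed by new heartbeats, so it is not flagged.
        if after is None or after == before:
            stale.append(machine)
    return sorted(stale)
```

The scrape window has to be longer than the heartbeat interval, otherwise healthy machines would be flagged too.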