[Feat]: cross node alerts
Problem
I have two nodes that are not streaming to the same parent. I would like to create an alert that spans both nodes, to make sure the ratio of their CPU usage stays broadly similar for some apps that run on both.
The use case here is that node A runs the latest version of some software and node B runs the last stable version. Both nodes run the same test workloads.
Description
I would like a way to define an alert on node A or B that can compute this CPU ratio. Basically, a way for a node to reach out and get a remote value from another agent as part of the health logic.
Importance
nice to have
Value proposition
- ability to define "cross node" health alerts.
Proposed implementation
No idea. Maybe this could be done in Netdata Cloud, or perhaps the agent could do it too, but with some limitations, for example on the frequency of the alert, to reduce network overhead.
I guess I could use the Prometheus collector to pull the specific metric I want from node B onto node A (or pull both the node A and node B metrics onto node C and do the alert there, if I wanted a truly like-for-like test).
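To make the "node C" idea a bit more concrete, here is a rough sketch of the kind of check I mean, run outside the health engine. It is only an illustration, not a proposal for how the agent should implement it. It assumes both agents expose the `/api/v1/data` REST endpoint on the default port; the hostnames `node-a` and `node-b`, the chart id `apps.cpu`, the dimension `myapp`, and the 1.2 / 1.5 thresholds are all placeholders I made up, and the chart id in particular may differ between agent versions.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values for illustration only; adjust to the real setup.
NODE_A = "http://node-a:19999"
NODE_B = "http://node-b:19999"
CHART = "apps.cpu"     # assumed chart id, may differ between agent versions
DIMENSION = "myapp"    # assumed app group name


def data_url(base, chart, dimension, seconds=60):
    """Build an /api/v1/data query: average of the last `seconds`, as one point."""
    query = urllib.parse.urlencode({
        "chart": chart,
        "dimensions": dimension,
        "after": -seconds,
        "points": 1,
        "group": "average",
        "format": "json",
    })
    return f"{base}/api/v1/data?{query}"


def first_value(payload):
    """Extract the value from a {"labels": [...], "data": [[time, value]]} reply."""
    rows = payload.get("data") or []
    if not rows or len(rows[0]) < 2 or rows[0][1] is None:
        return None
    return float(rows[0][1])


def fetch_cpu(base, chart=CHART, dimension=DIMENSION):
    """Query one agent and return the averaged CPU value, or None if missing."""
    url = data_url(base, chart, dimension)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return first_value(json.load(resp))


def ratio_status(cpu_a, cpu_b, warn=1.2, crit=1.5):
    """Compare the two values symmetrically using larger / smaller."""
    if cpu_a is None or cpu_b is None:
        return "UNDEFINED"
    low, high = sorted((cpu_a, cpu_b))
    if low <= 0:
        # The ratio is not meaningful when the app is idle on one node.
        return "UNDEFINED"
    ratio = high / low
    if ratio >= crit:
        return "CRITICAL"
    if ratio >= warn:
        return "WARNING"
    return "CLEAR"


def check():
    """Fetch both nodes and return the alert status (needs network access)."""
    return ratio_status(fetch_cpu(NODE_A), fetch_cpu(NODE_B))
```

Running `check()` from cron on node C would give me roughly what I want. The downside is that it lives outside the normal alert configuration and notification flow, which is why I would prefer something built in.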
I believe this is part of the work you were planning in the alerts domain, right @ktsaou?