Grafana Agent Operator - Support for the same features as Prometheus Operator
Following discussion with a number of users, it has been found that the Prometheus Operator and the Grafana Agent Operator are slightly out of feature alignment. The Prometheus Operator supports alerting rules and Alertmanager configuration, which the Grafana Agent Operator does not. As a result, users need additional tools outside the operator to maintain Alertmanager (AM) and alerts.
This proposal is to request the same level of support for Alerting in the Grafana Agent Operator, as is supported in the Prometheus Operator. Ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/alerting.md
This is similar to #523 where we discussed adding the ability to Grafana Agent to sync rules with the Cortex/Mimir ruler API.
It sounds reasonable to add that functionality into both the agent and the operator, though it's not going to be simple; we need to figure out how to identify rules that came from an agent so we know which should be added/removed/changed when reconciling the list.
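To make the reconciliation problem concrete, here is a minimal sketch of one possible approach: mark every rule group the agent uploads with a name prefix, and only ever touch groups carrying that prefix. The `RuleGroup` type, the `agent-managed/` prefix, and the `Reconcile` function are all hypothetical names invented for illustration; nothing here is taken from the agent, the operator, or the Cortex/Mimir ruler API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// RuleGroup is a hypothetical, simplified stand-in for a rule group.
type RuleGroup struct {
	Name    string
	Content string // serialized rules, compared as an opaque string
}

// ownedPrefix is an assumed naming convention that marks groups as
// having been created by the agent.
const ownedPrefix = "agent-managed/"

// Diff lists the group names to add, update, and remove in the ruler.
type Diff struct {
	Add, Update, Remove []string
}

// Reconcile compares the groups we want (unprefixed names) against the
// groups the ruler currently holds. Groups without ownedPrefix are
// treated as foreign and never modified.
func Reconcile(desired, actual []RuleGroup) Diff {
	want := make(map[string]string, len(desired))
	for _, g := range desired {
		want[ownedPrefix+g.Name] = g.Content
	}

	var d Diff
	seen := make(map[string]bool, len(actual))
	for _, g := range actual {
		if !strings.HasPrefix(g.Name, ownedPrefix) {
			continue // not created by the agent, leave it alone
		}
		seen[g.Name] = true
		content, ok := want[g.Name]
		switch {
		case !ok:
			d.Remove = append(d.Remove, g.Name)
		case content != g.Content:
			d.Update = append(d.Update, g.Name)
		}
	}
	for name := range want {
		if !seen[name] {
			d.Add = append(d.Add, name)
		}
	}

	// Sort for deterministic output; map iteration order is random.
	sort.Strings(d.Add)
	sort.Strings(d.Update)
	sort.Strings(d.Remove)
	return d
}

func main() {
	desired := []RuleGroup{
		{Name: "a", Content: "1"},
		{Name: "b", Content: "2"},
	}
	actual := []RuleGroup{
		{Name: "agent-managed/b", Content: "old"},
		{Name: "agent-managed/c", Content: "3"},
		{Name: "manual", Content: "x"},
	}
	fmt.Printf("%+v\n", Reconcile(desired, actual))
}
```

A name prefix is only one way to track ownership; a dedicated ruler namespace per agent, or a label on each rule, would serve the same purpose with different trade-offs.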
Prometheus Operator seems to store this info in the Prometheus CRs, at spec.{rule{,Namespace}Selector}.
While using different CRDs, VictoriaMetrics operator seems to have an alerts-specific VMAlert.spec.{rule{,Namespace}Selector} that describes where to select *Rule CRs from.
I assume that, to keep this as frictionless as possible, no new CRDs should be introduced, but Grafana Agent Operator could listen to Prometheus CRs?
Or, looking at https://github.com/grafana/agent/pull/1839 and similar proposals, Grafana Agent would watch PrometheusRule CRs, and filters for selectors/namespaces would be a config of Grafana Agent itself?
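As a rough illustration of what selector and namespace filtering could look like on the agent side, here is a self-contained sketch. The `Rule` and `Selector` types are hypothetical and deliberately simplified: namespaces are matched by name rather than by namespace labels, and an empty selector matches everything, which is an assumption for this example and not necessarily how Prometheus Operator treats its own selectors.

```go
package main

import "fmt"

// Rule is a hypothetical stand-in for the metadata of a PrometheusRule CR.
type Rule struct {
	Namespace string
	Name      string
	Labels    map[string]string
}

// Selector is an assumed, simplified filter configuration.
type Selector struct {
	MatchLabels map[string]string // every pair must be present on the rule
	Namespaces  []string          // empty means all namespaces (assumption)
}

// Matches reports whether the rule passes both the namespace filter
// and the label filter.
func (s Selector) Matches(r Rule) bool {
	if len(s.Namespaces) > 0 {
		found := false
		for _, ns := range s.Namespaces {
			if ns == r.Namespace {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	for k, v := range s.MatchLabels {
		// A missing key must not match, even when v is the empty string.
		if got, ok := r.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// Select returns "namespace/name" keys of the matching rules, in input order.
func Select(s Selector, rules []Rule) []string {
	var out []string
	for _, r := range rules {
		if s.Matches(r) {
			out = append(out, r.Namespace+"/"+r.Name)
		}
	}
	return out
}

func main() {
	rules := []Rule{
		{Namespace: "shop", Name: "checkout", Labels: map[string]string{"team": "payments"}},
		{Namespace: "shop", Name: "search", Labels: map[string]string{"team": "discovery"}},
		{Namespace: "infra", Name: "nodes", Labels: map[string]string{"team": "payments"}},
	}
	sel := Selector{
		MatchLabels: map[string]string{"team": "payments"},
		Namespaces:  []string{"shop"},
	}
	fmt.Println(Select(sel, rules))
}
```

In a real implementation the rules would come from a Kubernetes watch on PrometheusRule resources, and label matching would more likely reuse the standard Kubernetes label selector types than a hand-rolled map comparison.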
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed in 7 days if there is no new activity. Thank you for your contributions!
remove stale
This would be a boon if this support existed in the agent.
Hey all, support for recording and alerting rules is something still being considered. We don't have any updates right now.
With the recent Grafana Agent Flow announcement, it might make sense to support recording/alerting rules as Flow components, too.
Awesome to hear. The Flow features are cool, but the use case is to allow the rules to ship alongside individual applications as manifests, rather than requiring a more central cluster-level configuration. :crossed_fingers: for this feature parity ticket. It would also make it very easy for people currently using the Prometheus Operator to transition to these products. Your sales team would love it :wink:
Sorry if I wasn't clear; I meant we could have Flow components which could discover and consume alerting/recording rule CRs :) That would support the use case you just mentioned: provide PrometheusRule resources alongside applications and have Flow discover and act on them.
With https://github.com/grafana/agent/issues/1544 completed via https://github.com/grafana/agent/pull/2604/, the missing piece is now support in the Grafana Agent Operator to make use of the new feature.