cortex
[ruler] Allow Federated rules
Currently ruler rules are tied to a single tenant, but there are use cases where it is useful to evaluate a rule against multiple tenants.
Rather than writing individual rules against each tenant, it would be useful to be able to create rules that span multiple tenants, similar to the query frontend's ability to query multiple tenants. What would be the preferred way to make this configurable?
I took an initial pass at this in https://github.com/rdooley/cortex/tree/multitenant-ruler, building off some work @andrejbranch did in https://github.com/cortexproject/cortex/compare/master...andrejbranch:querier-multi-meta, and would love some feedback.
Note that we have always described Cortex's ability to handle separate data for each tenant as "multitenant"; we used the word "federated" when talking about querying more than one tenant at a time.
Updated the title to align with expected nomenclature, ty @bboreham
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.
There is some ongoing discussion on the proposal, and this feature is still desired.
Updating the title of this issue to reflect the state of the merged federated ruler proposal.
Hi @rdooley, thanks for this work! Since your PR would add a lot of value for us, do you know roughly how far it is from completion now that the proposal is officially accepted? Is there any way we can help with the implementation, testing, or anything else?
My previous draft PR isn't really worth going through the large merge required to bring it up to date, as it reflected an earlier state of the proposal. Due to some changes in work priorities, I'm not sure how much time I'll have in the immediate future to work on an implementation of the accepted proposal. I can link how I did a similar thing before by modifying the OrgID on the relevant contexts, but other than that I'm not much help at the moment.
Hi @rdooley re "I can link how I did a similar thing": could you share that link? Thanks!
https://github.com/cortexproject/cortex/pull/4520/files/0df05ce5f2a196174dad6fe9eb256c424e37e862#diff-016949dfe9193918272ee1387f5faba95d1216f13d8093c3b7f8c672e527f66eR589 In the manager eval, I checked whether a rule had srcTenant specified. If it did, I injected the srcTenant's orgID into a new context before passing that context to rule.eval.
Thanks @rdooley !
No worries, sorry I cannot be of more help right now.
One issue in that draft is that I was forced to vendor a lot of Prometheus code within Cortex to add fields to rules (fields which, given the proposal, should now be on the rule group instead). This will obviously incur a maintenance cost within Cortex that will rightly get some pushback. You may be able to accomplish a similar goal without doing this by using a standard label on the rule group or some other flexible means of extending the Prometheus rule group.
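One possible shape for "extending the rule group without vendoring" is to leave the upstream type alone and wrap it on the Cortex side. The sketch below is only an illustration under that assumption: `UpstreamGroup` stands in for the unmodified Prometheus rule group, and `FederatedGroup`, `SourceTenants`, and `IsFederated` are hypothetical names, not existing Cortex or Prometheus APIs.

```go
package main

import "fmt"

// UpstreamGroup stands in for the unmodified Prometheus rule group type.
type UpstreamGroup struct {
	Name  string
	Rules []string
}

// FederatedGroup is a hypothetical wrapper that adds the tenant list next to
// the upstream group instead of patching the upstream type.
type FederatedGroup struct {
	UpstreamGroup
	SourceTenants []string
}

// IsFederated reports whether the group reads from explicitly listed tenants.
func (g FederatedGroup) IsFederated() bool {
	return len(g.SourceTenants) > 0
}

func main() {
	g := FederatedGroup{
		UpstreamGroup: UpstreamGroup{Name: "cross-team", Rules: []string{"sum(up)"}},
		SourceTenants: []string{"team-a", "team-b"},
	}
	// Embedded fields stay reachable, so code that only needs the upstream
	// group can keep using g.Name and g.Rules unchanged.
	fmt.Println(g.Name, g.IsFederated())
}
```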
Hey folks, what is the status of this issue? This is something that will come in very handy for my use case. We have a lot of tenants which federate a lot of data from other tenants at a Prometheus level, because that helps us to create very detailed alerts.
The problem, of course, is that a lot of data is being duplicated across the infrastructure, which seems like a great waste of resources.
Cortex Ruler Federation seems to me like the best solution because it means that we can switch all our Prometheus instances to Agent mode, move all alerting to Cortex Ruler, and run a much leaner infrastructure as a result.
For alerts one could add federated datasources in Grafana and fire alerts based on those, but it wouldn't work for recording rules.
> Hey folks, what is the status of this issue?
- The proposal is in, so this is still a thing to do
- I have some links above showing how I was previously able to achieve this, with some caveats such as vendoring a bunch of Prometheus code to add the required fields to rules
- We have moved away from Cortex and I have changed teams internally, so I unfortunately don't have the bandwidth to work on this