Alert specific for each permutation of a sentry tag
Problem Statement
In Sentry alerts, I can set an individual alert based on filters such as tag, message, etc. However, I want to set up an alarm that essentially says "alert me if issue X happens for any respective tag value more than 10 times in 5 minutes".
To illustrate, suppose you send a Sentry warning whenever an HTTP 429 rate limit occurs, and the event carries a Sentry tag for the customer name. Let's say I have customers A, B, and C. I want to be alerted if customer A is rate limited more than 10 times in 5 minutes, and likewise for B and C, each counted separately.
Right now in Sentry, I would have to create a separate alert for each value of the customer tag (3 alerts in this example). This does not scale well.
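To make the requested behavior concrete, here is a minimal Python sketch of per-tag-value evaluation. It is only an illustration of the semantics, not Sentry's alerting API: the class `PerTagValueAlert`, its `record` method, and the parameter names are all hypothetical, and it assumes timestamps are in seconds and arrive in order.

```python
from collections import defaultdict, deque


class PerTagValueAlert:
    """Hypothetical evaluator: one sliding window per distinct tag value."""

    def __init__(self, threshold=10, window_seconds=300):
        self.threshold = threshold            # "more than 10 times"
        self.window_seconds = window_seconds  # "in 5 minutes"
        # tag value -> timestamps of events still inside the window
        self._events = defaultdict(deque)

    def record(self, tag_value, timestamp):
        """Record one event; return True if this tag value is over the threshold.

        Assumes timestamps are non-decreasing for each tag value.
        """
        window = self._events[tag_value]
        window.append(timestamp)
        # Drop events that have fallen out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.threshold


alert = PerTagValueAlert(threshold=10, window_seconds=300)
for t in range(11):
    fired = alert.record("customer-A", t)
# After the 11th event inside 5 minutes, customer-A fires.
# customer-B and customer-C are counted independently.
```

The point is that new tag values (new customers) get their own counter automatically, so nothing has to be edited when a customer is added.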
Solution Brainstorm
No response
Product Area
Alerts
Assigning to @getsentry/support for routing ⏲️
Routing to @getsentry/product-owners-user-feedback for triage ⏲️
Routing to @getsentry/product-owners-alerts for triage ⏲️
"alert me if issue X happens for any respective tag value more than 10 times in 5 minutes"
Do you know the tag values that you want to be alerted on? Assuming yes from your example of customers A, B, and C, you can add three conditions to the alert with "IF any of these filters match".
Let's say I have >1K customers. The filter does not scale well and needs to be changed every time a new customer comes along. Also, the idea here is not to sum them all up, because a major % change in an error for any one customer might get lost in the aggregated number.
Let's say in the case of customer A, B, and C that I do the "IF any of these filters match" and make the trigger limit 30 (3x customers for "more than 10 times in 5 minutes").
If customers A and B are at 0 rate limit errors, then customer C alone would have to get more than 30 errors in 5 minutes to trigger the alert. This is different from my goal, which is 10 for any one customer. On the flip side, if I kept the threshold at 10 for the sum of all customers to make sure nothing slips through for customer C (or D, E, etc.), the alert would likely be overly noisy, and the customer (or tag value) causing the spike would not be readily visible.
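The difference between the two approaches can be shown with a small Python sketch. The counts are made-up numbers for one 5-minute window, and the function names `summed_alert` and `per_value_alert` are hypothetical, not anything Sentry provides.

```python
from collections import Counter


def summed_alert(counts, threshold):
    """'IF any of these filters match': one threshold on the combined count."""
    return sum(counts.values()) > threshold


def per_value_alert(counts, threshold):
    """Requested behavior: the threshold applies to each tag value separately.

    Returns the tag values that exceeded it, so the culprit is visible.
    """
    return sorted(value for value, n in counts.items() if n > threshold)


# Case 1: only customer C is spiking (illustrative counts).
spike = Counter({"A": 0, "B": 0, "C": 12})
missed = summed_alert(spike, threshold=30)      # False: 12 is not > 30
culprits = per_value_alert(spike, threshold=10) # ["C"]

# Case 2: nobody is spiking, but the sum crosses a low threshold.
quiet = Counter({"A": 4, "B": 4, "C": 4})
noisy = summed_alert(quiet, threshold=10)          # True: 12 > 10
no_culprits = per_value_alert(quiet, threshold=10) # []
```

With the summed threshold at 30 the spike for C is missed, and with it at 10 the alert fires even though no single customer is over the limit.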
@ale-cota is this use case more suited for developer metrics? 🤔
Hello, following up here because this has come up for another use case: using warnings in Sentry to proactively detect when a customer has a misconfiguration in our ecosystem.
Hi @david-goodfellow - can you share more about the expected cardinality of the data that you have in mind? We've added your feature request to our backlog, and will let you know when we're able to prioritize it. Thanks!
The volume is fairly low (<10 logs every minute). The number of distinct values the tag could have is about 200.
Internal customer case https://www.notion.so/sentry/2118b10e4b5d80b29248cac381c60a26