Strange behaviour of silences.expired

Open ngc104 opened this issue 3 years ago • 7 comments

From the ChangeLog v0.94 :

Unsilenced alerts will now show recently expired silences if they are old enough. By default silences expired in the last 10 minutes will be shown, this can be configured by setting silences:expired option or --silences.expired flag. Setting this value to 5m will show silences if they expired in the last 5 minutes but only if the alert started firing at least 5 minutes ago.

Let's use the default config of silences.expired=10m.

10h : an alert fires
10h06 : I set a silence of 1 minute
10h07 : the silence expires

In this case I expect the alert to reappear together with the expired silence, labelled "expired a few seconds ago".

The doc explains: "Setting this value to 5m will show silences if they expired in the last 5 minutes but only if the alert started firing at least 5 minutes ago." In my case, as far as I understand the doc, the silence will show only if the alert started firing at least 10 minutes ago.

Question/bug 1: why should the alert be older than the silences.expired duration ?

Let's continue...

10h11 : I set a new silence of 1 minute
10h12 : the silence expires

I can now see both silences, labelled "expired a few seconds ago" and "expired 5 minutes ago".

Question/bug 2: At 10h07 I could not see the 1st silence. Why can I see it now (I can see the 2 silences now) ?

ngc104 avatar Jan 04 '22 14:01 ngc104

Question/bug 1: why should the alert be older than the silences.expired duration ?

Because the goal is to tell you why you suddenly see an alert, so that you know there's likely no point debugging it: this is an old issue that was silenced, and the silence just expired. The duration requirement exists to show the silence only where it's relevant, when it's something you likely don't know about, rather than show you all silences ever linked to a given alert. If both the alert and the silence are recent then you're likely to be the one who silenced it, and so you know about it.

Question/bug 2: At 10h07 I could not see the 1st silence. Why can I see it now (I can see the 2 silences now) ?

Because at 10h07 the alert was only 7 minutes old, and that's more recent than the 10 minutes required by silences.expired=10m
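A minimal sketch of the rule as the v0.94 changelog words it, applied to the timeline from the question. This is not karma's actual code: the function name, its parameters, the calendar date and the inclusive boundaries are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def expired_silence_visible(now, alert_started, silence_ended, window):
    """Hypothetical helper mirroring the v0.94 changelog wording.

    The silence is shown only if it expired within `window` AND the alert
    has been firing for at least `window`. Inclusive bounds are an assumption.
    """
    silence_recent = timedelta(0) <= now - silence_ended <= window
    alert_old_enough = now - alert_started >= window
    return silence_recent and alert_old_enough


window = timedelta(minutes=10)  # silences.expired=10m
day = datetime(2022, 1, 4)      # arbitrary date, only the times matter

alert_started = day.replace(hour=10, minute=0)
first_silence_ended = day.replace(hour=10, minute=7)
second_silence_ended = day.replace(hour=10, minute=12)

# 10h07: the alert is only 7 minutes old, so the first silence is hidden.
at_1007 = expired_silence_visible(
    first_silence_ended, alert_started, first_silence_ended, window
)

# 10h12: the alert is now 12 minutes old, so both silences qualify.
at_1012_first = expired_silence_visible(
    second_silence_ended, alert_started, first_silence_ended, window
)
at_1012_second = expired_silence_visible(
    second_silence_ended, alert_started, second_silence_ended, window
)

print(at_1007, at_1012_first, at_1012_second)
```

Under these assumptions the first silence is hidden at 10h07 and both silences are visible at 10h12, which matches what was observed.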

prymitive avatar Jan 04 '22 15:01 prymitive

Thanks for your answers. Let me give you the real use cases

Question/bug 1

In the Production environment, we are not watching Karma all the time. We have other alerting systems. But Karma is the best tool to have an overview of all the alerts.

In this context, we would not set silences.expired to 10 minutes because we would miss that label nearly all the time. Setting 24h or even 72h (for weekends) is more relevant.

Day 1...
18h00 : an alert fires
18h20 : I set a silence of 1 hour (and I try to fix the problem)
19h00 : the alert still fires and I decide that I can go home, the problem can be fixed tomorrow. I forget to add a new silence for the night.
19h20 : the alert becomes visible again (the silence expired)

From what I understand, the "expired" label will only show at 18h20 the day after (day 2).

In that case, that label is really needed at 7 am when colleagues come to work and will see that alert.

Question/bug 2

Back to the previous example (theoretical example) : if you decide not to show the label at 10h10, for what reason would you show it at 10h12 ?

Transposed to the real-case example :
At 10am the day after (day 2), I see my alert still fires and I set a new silence for 12h (it will expire at 10pm).
At 7am on day 3, colleagues will see that alert again, but with the labels of the 2 silences.
Why was the first label hidden on day 2 and why is it showing on day 3 ?

ngc104 avatar Jan 06 '22 10:01 ngc104

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Mar 07 '22 19:03 github-actions[bot]

Hello,

Any update on this ?

ngc104 avatar Mar 07 '22 20:03 ngc104

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar May 07 '22 19:05 github-actions[bot]

Hello,

I was thinking... from Changelog v0.100 :

silences:expired option no longer takes alerts age into account.

Does that mean this issue has become N/A ?

I have not had time to test yet.
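As a rough illustration of what that changelog line would imply, assuming it means exactly what it says: the check would reduce to "did the silence expire within the window", with the alert's age ignored. This is a sketch only, not karma's actual code. The function name, the arbitrary date and the inclusive boundaries are assumptions.

```python
from datetime import datetime, timedelta


def expired_silence_visible_v0100(now, silence_ended, window):
    """Hypothetical helper: the rule with the alert-age condition dropped."""
    return timedelta(0) <= now - silence_ended <= window


window = timedelta(minutes=10)                 # silences.expired=10m
silence_ended = datetime(2022, 1, 4, 10, 7)    # the 10h07 expiry from the first example

# Shortly after expiry the silence would be shown, however young the alert is.
visible_right_after = expired_silence_visible_v0100(
    silence_ended + timedelta(seconds=30), silence_ended, window
)

# Once the window has passed, it would be hidden again.
visible_much_later = expired_silence_visible_v0100(
    silence_ended + timedelta(minutes=11), silence_ended, window
)

print(visible_right_after, visible_much_later)
```

If this reading is right, the first silence in the 10h07 example would be shown immediately after it expires.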

ngc104 avatar May 10 '22 06:05 ngc104

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 10 '22 19:07 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 14 '22 19:09 github-actions[bot]

This issue has been N/A for a while, probably since v0.100. It can be closed.

ngc104 avatar Sep 19 '22 08:09 ngc104