Production - [Alerting] Apple simulator failure rate alert

Open dotnet-eng-status[bot] opened this issue 2 years ago • 7 comments

:broken_heart: Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=dci-mac-build-057} 83

Go to rule

@dotnet/dnceng, please investigate

Automation information below, do not change

Grafana-Automated-Alert-Id-36d07fceeaf0472b804d8358b2198eac

dotnet-eng-status[bot] Sep 12 '22 22:09

:broken_heart: Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=dci-mac-build-045} 100
  • FailureRate {Machine=dci-mac-build-054} 100

Go to rule

dotnet-eng-status[bot] Sep 13 '22 10:09

After discussion with @premun, we decided to exclude app failures (ExitCode 80) from the alert, as they are not an infrastructure issue. PR for the change: https://dev.azure.com/dnceng/internal/_git/dotnet-helix-service/pullrequest/25729

oleksandr-didyk Sep 13 '22 11:09

:broken_heart: Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=dci-mac-build-008} 100
  • FailureRate {Machine=dci-mac-build-045} 100

Go to rule

dotnet-eng-status[bot] Sep 13 '22 22:09

:green_heart: Metric state changed to ok

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

Go to rule

dotnet-eng-status[bot] Sep 14 '22 09:09

:broken_heart: Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=dci-mac-build-009} 100

Go to rule

dotnet-eng-status[bot] Sep 17 '22 09:09

:broken_heart: Metric state changed to alerting

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

  • FailureRate {Machine=dci-mac-build-009} 100

Go to rule

dotnet-eng-status[bot] Sep 17 '22 21:09

:green_heart: Metric state changed to ok

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

Go to rule

dotnet-eng-status[bot] Sep 18 '22 09:09

:green_heart: Metric state changed to ok

Description and instructions for this alert

Please note that this alert will fire every 12 hours as the list of machines can change while the alert is alive. So please keep an eye on the list of machines in the comment.

Go to rule

dotnet-eng-status[bot] Sep 29 '22 01:09