Staging - [Alerting] Test Reporting Services Monitoring alert

Open • dotnet-eng-status-staging[bot] opened this issue 3 years ago • 28 comments

:broken_heart: Metric state changed to alerting

Hey FR! The Dev WF team should be on top of these alerts until we hand them off.

(When we have engineering documentation, we should add it here. Until then, contact one of the v-team members for the Dev WF epic to help investigate.)

  • AzureDevOpsTestAggregation Exceptions 1

Metric Graph

Go to rule

@dotnet/dnceng, please investigate

Automation information below, do not change

Grafana-Automated-Alert-Id-fcb554ece9d8452a98c1e18d3ffb8fe8

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 2

:green_heart: Metric state changed to ok

@AlitzelMendez @missymessa can we close this issue?

michellemcdaniel, Sep 06 '22 17:09

Oh.. that's a lot of exceptions...

missymessa, Sep 06 '22 17:09

Looks like a ton of generic: "An error occurred while sending the request."

The only thing that looks interesting is: "Kusto client timed-out when sending a request to the service. "

missymessa, Sep 06 '22 17:09

I'm curious, what's a good query to start with?

garath, Sep 06 '22 18:09

> I'm curious, what's a good query to start with?

Good query for what?

missymessa, Sep 06 '22 18:09

Filed an IcM with Kusto per the instructions on the error we got from them in the logs :)

https://portal.microsofticm.com/imp/v3/incidents/details/333064319/home

missymessa, Sep 06 '22 18:09

@garath I'm assuming you want to know what I did to find the errors? I generally look at this panel to see what's going on with the service that was erroring: https://dotnet-eng-grafana-staging.westus2.cloudapp.azure.com/d/buildAnalysis/build-analysis?viewPanel=51

missymessa, Sep 06 '22 18:09

Kusto team closed the IcM I created with this feedback:

> Cluster is healthy, this query has been executed for an hour hence the query has been timeout, please review your query and adjust it accordingly.
>
> Query best practices - Azure Data Explorer | Microsoft Docs

So, we need to investigate what caused that query to run for so long.

missymessa, Sep 07 '22 16:09

@missymessa should this go to the DevWF epic then?

garath, Sep 08 '22 19:09

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • TestDataAggregationService Exceptions 1
  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok

:broken_heart: Metric state changed to alerting

  • AzureDevOpsTestAggregation Exceptions 1

:green_heart: Metric state changed to ok
