Missing Chrome and Firefox stable results for July 19, 2022
Hi @foolip and @jgraham, I was checking the ecosystem dashboard and saw that the stable runs circle is red.
Would you happen to have any insight on why the Chrome and Firefox results did not come in for July 19?

Many thanks, James S.
Results came in the following day
@jcscottiii judging by the recent stable runs, this has been happening with some regularity: https://wpt.fyi/runs?label=master&label=stable&max-count=100&product=chrome&product=firefox&product=safari
Here's how I'd investigate what went wrong. I'll use the runs missing in 6808a6b as the example, as that's more recent. First click "6808a6b" to get to this view: https://wpt.fyi/results/?sha=6808a6b426&label=master&label=stable&max-count=1&product=chrome&product=firefox&product=safari
Then, clicking "6808a6b" under Safari will get you to GitHub: https://github.com/web-platform-tests/wpt/commit/6808a6b426
There's a red x next to "Fix expectations for contain-intrinsic-size-028.html" that you can click to see which checks passed and which failed. Chrome and Firefox are run on Taskcluster, so click the first failing Taskcluster check to get here: https://github.com/web-platform-tests/wpt/runs/7512548164
Clicking through some more gets us to the task group: https://community-tc.services.mozilla.com/tasks/groups/JqrwqVINQQyGb0F3opyVjw
But it looks like Chrome and Firefox all passed, right? The problem is then most likely with the wpt.fyi processor.
Maybe the processor requires all tasks to pass, so a failure of any task prevents processing of results. I don't think this is the case, but it would explain it.
Since this continues to happen, I'll reopen. @jcscottiii do you know where in GCP to find processor logs to dig into what went wrong?
@foolip thanks so much for these steps. I can take a look into the processor's logs in GCP.
Some raw notes from the investigation:
{
insertId: "14iltpfgnk6as"
logName: "projects/wptdashboard/logs/request_log_entries"
receiveTimestamp: "2022-07-26T05:05:49.245405560Z"
resource: {2}
severity: "ERROR"
textPayload: "Failed to fetch check runs for suite 7515393036: GET https://api.github.com/repos/web-platform-tests/wpt/check-suites/7515393036/check-runs?page=7&per_page=25: 502 Server Error []"
timestamp: "2022-07-26T05:05:48.234525276Z"
trace: "projects/wptdashboard/traces/ce97c386a47ff612fd5163f70a13d189"
}
{
insertId: "14iltpfgnk6at"
logName: "projects/wptdashboard/logs/request_log_entries"
receiveTimestamp: "2022-07-26T05:05:49.245405560Z"
resource: {2}
severity: "ERROR"
textPayload: "GET https://api.github.com/repos/web-platform-tests/wpt/check-suites/7515393036/check-runs?page=7&per_page=25: 502 Server Error []"
timestamp: "2022-07-26T05:05:48.234573772Z"
trace: "projects/wptdashboard/traces/ce97c386a47ff612fd5163f70a13d189"
}
That corresponds to this
https://github.com/web-platform-tests/wpt.fyi/blob/2bb8884902f15c98bbe4a431e6db22ad7eeb4159/api/taskcluster/webhook.go#L133-L137
runs, err := api.ListCheckRuns(owner, repo, checkSuite.GetCheckSuite().GetID())
if err != nil {
	log.Errorf("Failed to fetch check runs for suite %v: %s", checkSuite.GetCheckSuite().GetID(), err.Error())
	return EventInfo{}, err
}
which is called from here:
	event, err = GetCheckSuiteEventInfo(checkSuite, log, api)
}
if err != nil {
	log.Errorf("%v", err)
	http.Error(w, err.Error(), http.StatusInternalServerError)
	return
I also analyzed today's run, 6808a6b, since its results seem to be missing too. It shows the same error:
{
insertId: "968z1tfcgdau0"
logName: "projects/wptdashboard/logs/request_log_entries"
receiveTimestamp: "2022-08-01T05:31:22.118640003Z"
resource: {2}
severity: "ERROR"
textPayload: "Failed to fetch check runs for suite 7600616016: GET https://api.github.com/repos/web-platform-tests/wpt/check-suites/7600616016/check-runs?page=4&per_page=25: 502 Server Error []"
timestamp: "2022-08-01T05:31:21.911628683Z"
trace: "projects/wptdashboard/traces/8bafec85f590002ba7f0e54037f32549"
}
{
insertId: "968z1tfcgdau1"
logName: "projects/wptdashboard/logs/request_log_entries"
receiveTimestamp: "2022-08-01T05:31:22.118640003Z"
resource: {2}
severity: "ERROR"
textPayload: "GET https://api.github.com/repos/web-platform-tests/wpt/check-suites/7600616016/check-runs?page=4&per_page=25: 502 Server Error []"
timestamp: "2022-08-01T05:31:21.911653794Z"
trace: "projects/wptdashboard/traces/8bafec85f590002ba7f0e54037f32549"
}
Initial Diagnosis
I started tracing from the point where the GitHub webhook is called for Taskcluster.
There appears to be an intermittent problem when calling the GitHub API. I still need to find out why it stops both browsers' results from uploading. It may be as you said, @foolip, but I need to confirm.
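To gauge how intermittent this is, the `textPayload` strings could be exported from the logs and tallied by suite, page, and status. Here is a minimal sketch of the parsing step, assuming the payloads keep the shape seen in the entries above. The names `checkRunsFailure`, `payloadRE`, and `parseFailure` are made up for illustration and are not part of wpt.fyi.

```go
package main

import (
	"regexp"
	"strconv"
)

// checkRunsFailure holds the fields pulled out of one error payload.
type checkRunsFailure struct {
	SuiteID int64 // check suite ID from the request path
	Page    int   // page number that failed
	Status  int   // HTTP status reported by the GitHub API
}

// payloadRE matches the request path, page number, and status code in
// payloads shaped like the log entries quoted in this thread.
var payloadRE = regexp.MustCompile(`/check-suites/(\d+)/check-runs\?page=(\d+)&per_page=\d+: (\d{3}) `)

// parseFailure extracts the suite ID, page, and status from a payload.
// The boolean is false when the payload does not match the expected shape.
func parseFailure(payload string) (checkRunsFailure, bool) {
	m := payloadRE.FindStringSubmatch(payload)
	if m == nil {
		return checkRunsFailure{}, false
	}
	suite, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return checkRunsFailure{}, false
	}
	page, err := strconv.Atoi(m[2])
	if err != nil {
		return checkRunsFailure{}, false
	}
	status, err := strconv.Atoi(m[3])
	if err != nil {
		return checkRunsFailure{}, false
	}
	return checkRunsFailure{SuiteID: suite, Page: page, Status: status}, true
}
```

Feeding it the payloads from July 26 and August 1 would show whether the failing page varies between occurrences (page 7 in one, page 4 in the other), which would be consistent with a transient server-side error rather than one specific bad page.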