wpt-metadata
Programmatic mapping of Chrome failures to existing monorail bugs
In https://github.com/web-platform-tests/wpt-metadata/issues/481, one of the programmatic imports we did took a set of Chrome-specific failures on wpt.fyi and matched them against existing monorail bugs by searching for the test filenames.
It makes sense to do this for all Chrome failures, not just Chrome-specific failures, so let's do that! I'm going to use this issue to track it :).
Initial request none(triaged:chrome) chrome:!pass chrome:!ok:
curl 'https://wpt.fyi/api/search' \
-H 'authority: wpt.fyi' \
-H 'user-agent: Mozilla/5.0 (X11; CrOS x86_64 13505.111.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.152 Safari/537.36' \
-H 'content-type: text/plain;charset=UTF-8' \
-H 'accept: */*' \
-H 'origin: https://wpt.fyi' \
-H 'sec-fetch-site: same-origin' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-dest: empty' \
-H 'referer: https://wpt.fyi/results/?label=master&label=experimental&aligned&q=none%28triaged%3Achrome%29%20chrome%3A%21pass%20chrome%3A%21ok' \
-H 'accept-language: en-US,en;q=0.9,en-CA;q=0.8' \
-H 'cookie: _ga=GA1.2.1092915640.1598030877; session=MTYxMTI0MjQ5NnxaRmFBcXJ5enh4djdPZDdCTXpqT3FiMHJJZ0xHcHNEN25JQWlzRi0xSjlTRWUyMi1XQ2I5MG1vRWRDQmw5OFVfcmxsTHliaTVaYTBzcFE4NWZTWWc0NFRvMk9Qc0twY2Y0UXZaeldPWlBLT1ZkZThjVmlLUE1XWWk4M0wtRWlkdW43M2xubU92X0RUeGx3ZWpkN21PY0hJUlVKc3huaHlrNmQzSzJERGJHblRpSHA1VGhUY0hvck9CWXdfcTNpcXlBWW04Z2dOM3M1V2NDNmprUldPSjFkSFFCMG82UHJZbGdwcEdQRlFjdlhDa2tINFRpeWlweFpra1ZBWDV2WnBIRDlqaGhUYzZ4Wm5mc1E9PXw7TL71UAi2DVHJma8h7VTzNBJCJXlXxUaX3e1iIKYj3A==; _gid=GA1.2.1974819826.1611662905; _gat=1' \
--data-binary '{"run_ids":[5734059885985792,4806348191563776,5731688392949760,5701194796236800],"query":{"and":[{"none":[{"triaged":"chrome"}]},{"exists":[{"product":"chrome","status":{"not":"PASS"}}]},{"exists":[{"product":"chrome","status":{"not":"OK"}}]}]}}' \
--compressed
EDIT: In retrospect, I should also have specified chrome:!missing.
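In terms of the request body above, the extra chrome:!missing term would presumably become one more `exists` clause. The following is a minimal sketch, assuming the structured form of `chrome:!missing` mirrors the `!pass` and `!ok` clauses. The `MISSING` status string and the helper name `build_query` are assumptions and have not been checked against the wpt.fyi API.

```python
import json

# run_ids copied from the request above.
RUN_IDS = [5734059885985792, 4806348191563776,
           5731688392949760, 5701194796236800]


def build_query(run_ids, excluded_statuses=("PASS", "OK", "MISSING")):
    """Build a search body: not triaged for chrome, and one 'exists'
    clause per excluded chrome status ("MISSING" is an assumed name)."""
    clauses = [{"none": [{"triaged": "chrome"}]}]
    for status in excluded_statuses:
        clauses.append(
            {"exists": [{"product": "chrome", "status": {"not": status}}]})
    return {"run_ids": list(run_ids), "query": {"and": clauses}}


# The serialized body would replace the --data-binary argument.
body = json.dumps(build_query(RUN_IDS))
```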
Flattened the results.json into a list of tests:
import json

with open('results.json', 'r') as f:
    results = json.load(f)

tests = results['results']
with open('tests.txt', 'w') as f:
    for test in tests:
        f.write(test['test'])
        f.write('\n')
And sorted it:
sort -o tests.txt tests.txt
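The two steps above (flatten, then sort) could also be done in one pass. This is only a sketch: `flatten_results` is a hypothetical helper, and the sample data is made up to match the `results` and `test` fields that the script above reads.

```python
import json


def flatten_results(results):
    """Return the sorted test paths from a parsed search response."""
    return sorted(entry['test'] for entry in results['results'])


# Hypothetical sample with the same shape as results.json.
sample = json.loads(
    '{"results": [{"test": "/b/two.html"}, {"test": "/a/one.html"}]}')
print('\n'.join(flatten_results(sample)))
```

Note that `sorted` orders strings by code point, which may differ from `sort` when a non-C locale is in effect.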
And then a blinkpy script to search monorail:
import sys

from blinkpy.w3c.monorail import MonorailAPI, MonorailIssue
from blinkpy.common.net.luci_auth import LuciAuth
from blinkpy.common.host import Host
# Import the submodule explicitly so that googleapiclient.errors resolves.
import googleapiclient.errors

host = Host()
token = LuciAuth(host).get_access_token()
api = MonorailAPI(access_token=token)

# A cache in case the runs break halfway. They did.
# Lines marked ERRORED are skipped here so those tests get retried.
processed_tests = set()
with open('processed-tests.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if 'ERRORED' in line:
            continue
        processed_tests.add(line.split(' ')[0])

with open('tests.txt', 'r') as f:
    tests = [line.strip() for line in f]

issues = api.api.issues()


def log(msg):
    print(msg)
    sys.stdout.flush()


log("Processing %s tests" % len(tests))
for test in tests:
    if test in processed_tests:
        continue
    try:
        # Search open chromium bugs for the test path.
        resp = issues.list(projectId='chromium', q=test, can='open').execute()
        bug_ids = map(lambda x : str(x['id']), resp['items'] if resp['totalResults'] > 0 else [])
        log("%s => [%s]" % (test, ','.join(bug_ids)))
    except googleapiclient.errors.HttpError:
        log("%s ERRORED" % test)
And the results: processed-tests.txt
(Note: I did a pass through processed-tests.txt and removed 626703, as it is a known meta-bug of zero value.)
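For reference, the lines written by the search script have the form `<test> => [<id>,<id>]` or `<test> ERRORED`. A rough sketch of reading them back and dropping the meta-bug in the same pass could look like this. `parse_processed` is a hypothetical helper, and the sample test paths and the bug ID 123 are invented.

```python
META_BUG = '626703'  # the known meta-bug mentioned above


def parse_processed(lines, ignored=(META_BUG,)):
    """Map each test to its bug IDs, skipping ERRORED lines and ignored bugs."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.endswith(' ERRORED'):
            continue
        test, _, rest = line.partition(' => ')
        ids = [b for b in rest.strip('[]').split(',')
               if b and b not in ignored]
        mapping[test] = ids
    return mapping


# Invented sample lines in the format the search script logs.
sample_lines = [
    '/css/a.html => [123,626703]',
    '/css/b.html => []',
    '/css/c.html ERRORED',
]
bugs_by_test = parse_processed(sample_lines)
```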
Next step, turn this into a series of wpt-metadata PRs (using the same golang script as before), and let @foolip sort through which are junk and which are useful ;)
Golang script for updating wpt-metadata: https://gist.github.com/stephenmcgruer/0b84c426f2840003c542bcb25740a9d8
@foolip - I've sent you a few PRs now; consider them a sample of the output of this methodology. I'd like those reviewed first to see if the rest of the data is worth uploading or not (the PRs so far cover ~30% of the ~800 total tests that fail in wpt.fyi and have exactly one bug when you search their test path in monorail).
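The "exactly one bug" selection mentioned above amounts to a simple filter over the search results. As an illustration only, assuming the results are already held as a dict from test path to a list of bug IDs (`single_bug_tests` and the sample data are hypothetical):

```python
def single_bug_tests(bugs_by_test):
    """Return {test: bug_id} for tests whose search matched exactly one bug."""
    return {test: ids[0]
            for test, ids in bugs_by_test.items()
            if len(ids) == 1}


# Invented sample: one unambiguous match, one ambiguous, one with no match.
sample_bugs = {
    '/css/a.html': ['123'],
    '/css/b.html': ['123', '456'],
    '/css/c.html': [],
}
unambiguous = single_bug_tests(sample_bugs)
```

Tests with several matching bugs are left out here because the search alone cannot say which bug is the relevant one.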
https://github.com/web-platform-tests/wpt-metadata/pull/804#issuecomment-770963731 has some bits useful for reviewing these PRs:
bug-titles.txt (faux-CSV file; split on the first ',' only).
Also as a spreadsheet: https://docs.google.com/spreadsheets/d/1AJFl3gLfVFjOXRAir9g2BsesdF3fGp9LJHnDtWQLW-c/edit#gid=0
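Splitting on the first ',' only matters because the rest of the line may itself contain commas. A small sketch, assuming each line is a bug ID followed by its title (the column order is an assumption, and `parse_bug_title` and the sample line are hypothetical):

```python
def parse_bug_title(line):
    """Split a faux-CSV line into (first field, remainder) on the first comma."""
    first, rest = line.rstrip('\n').split(',', 1)
    return first, rest


# Invented sample line; the title keeps its own comma.
parsed = parse_bug_title('123,Layout test fails, but only on Linux\n')
```

A regular CSV parser would instead treat the second comma as a field separator, since the titles are not quoted.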