wpt-metadata Porting TestExpectations to WPT Metadata

Per Dave Tapuska's suggestions, WPT Metadata should be a subset of TestExpectations, but with a focus on Chrome-specific test failures. The triage information in TestExpectations should be ported continuously to WPT Metadata.

For now, we can write an ad-hoc script to port the existing triage information to WPT Metadata, and think about an automated approach going forward (e.g. a bot).

Jun 04 '20 20:06 KyleJu

The portable expectations in TestExpectations should meet the following criteria:

There exists a bug number;
Tags should be [], [ Linux ], [ Release ] or any combination of these three;
Test name should start with external/wpt/ (this excludes expectations for runtime flags);
The result is not [ Skip ] (maybe ideally [ Failure ] only? Exclude flaky tests);
The same test exists and fails on wpt.fyi (this check is needed because of orphaned expectations and outdated WPT or Chrome versions).
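
The first four criteria above could be sketched as a filter over the file. This is a rough sketch, not the actual porting script: the regular expression covers only the simple `crbug.com/N [ tags ] test [ results ]` line shape, the names `parse_line`, `is_portable` and `portable_expectations` are made up for illustration, and treating a single [ Failure ] or [ Timeout ] result as "not flaky" is an assumption. The wpt.fyi check (the last criterion) needs external data and is left out.

```python
import re
from typing import FrozenSet, List, NamedTuple, Optional

# Simplified pattern for one line, e.g.
#   crbug.com/123 [ Linux ] external/wpt/foo.html [ Failure ]
# The bug and the tag list are both optional.
LINE_RE = re.compile(
    r"^(?P<bug>crbug\.com/\d+)?\s*"
    r"(?:\[\s*(?P<tags>[^\]]*?)\s*\]\s+)?"
    r"(?P<test>\S+)\s+"
    r"\[\s*(?P<results>[^\]]*?)\s*\]\s*$"
)

ALLOWED_TAGS = frozenset({"Linux", "Release"})
# Assumption: only these results are worth porting.
PORTABLE_RESULTS = frozenset({"Failure", "Timeout"})
WPT_PREFIX = "external/wpt/"


class Expectation(NamedTuple):
    bug: Optional[str]
    tags: FrozenSet[str]
    test: str
    results: FrozenSet[str]


def parse_line(line: str) -> Optional[Expectation]:
    """Parse one line; return None for blanks, comments and unknown shapes."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    return Expectation(
        bug=match.group("bug"),
        tags=frozenset((match.group("tags") or "").split()),
        test=match.group("test"),
        results=frozenset(match.group("results").split()),
    )


def is_portable(exp: Expectation) -> bool:
    """Apply the bug, tag, test-name and result criteria."""
    return (
        exp.bug is not None
        and exp.tags <= ALLOWED_TAGS
        and exp.test.startswith(WPT_PREFIX)
        # Exactly one result: several results usually mean a flaky test.
        and len(exp.results) == 1
        and exp.results <= PORTABLE_RESULTS
    )


def portable_expectations(text: str) -> List[Expectation]:
    parsed = (parse_line(line) for line in text.splitlines())
    return [exp for exp in parsed if exp is not None and is_portable(exp)]
```

Requiring the test name to start with external/wpt/ also drops entries such as virtual/.../external/wpt/..., which is how the sketch excludes runtime-flag expectations.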

Jul 07 '20 01:07 KyleJu

According to failing-tests, TestExpectations records tests that cannot be rebaselined. I suspect the Chrome-specific failures can go unnoticed during the import process.

Jul 07 '20 02:07 KyleJu

Results is not [ Skip ] (maybe ideally [ Failure ] only? Exclude flaky tests);

[ Timeout ] also seems like it would be useful?

Flakes may also be useful, but are obviously harder to track.

Jul 07 '20 14:07 stephenmcgruer

TestExpectations has been ported to WPT Metadata in https://github.com/web-platform-tests/wpt-metadata/pull/278. The selection criteria are the ones listed in the comment above, except that only [ Failure ] and [ Timeout ] results are included. Flaky tests are not ported at this point.

Per our discussions offline, the TestExpectations file only records reference test failures, flaky tests and non-deterministic tests. New (or Chrome-specific) failures could most likely go unnoticed during the WPT import process, as they are rebaselined automatically. As a result, WPT Metadata isn't a subset of TestExpectations, and TestExpectations is not the single source of truth for Chrome test failures.

Going forward, we should figure out a way to port candidates from TestExpectations to WPT Metadata continuously, e.g. via a bot.
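
For illustration, the mapping from a selected expectation to a metadata entry might look roughly like this. It is a sketch, not the script behind the PR: it assumes the META.yml layout of one file per test directory holding a `links` list whose items carry `product`, `url` and `results`, and it assumes [ Failure ] and [ Timeout ] map to the FAIL and TIMEOUT statuses. The function name `to_metadata` and the tuple input are made up, and the result is a plain dict rather than serialized YAML.

```python
import posixpath
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed mapping from TestExpectations results to wpt statuses.
STATUS_MAP = {"Failure": "FAIL", "Timeout": "TIMEOUT"}
WPT_PREFIX = "external/wpt/"


def to_metadata(
    expectations: Iterable[Tuple[str, str, str]]
) -> Dict[str, Dict[str, List[dict]]]:
    """Group (bug, test_path, result) tuples by WPT directory.

    Returns {directory: {"links": [...]}}, one dict per would-be META.yml.
    """
    by_dir = defaultdict(list)
    for bug, test_path, result in expectations:
        # Strip the Chromium-side prefix to get the upstream WPT path.
        relative = test_path[len(WPT_PREFIX):]
        directory, filename = posixpath.split(relative)
        by_dir[directory].append({
            "product": "chrome",
            "url": "https://" + bug,
            "results": [{"test": filename, "status": STATUS_MAP[result]}],
        })
    return {directory: {"links": links} for directory, links in by_dir.items()}
```

For example, `to_metadata([("crbug.com/1", "external/wpt/css/css-grid/a.html", "Failure")])` yields a single entry keyed by `css/css-grid`.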

Jul 13 '20 22:07 KyleJu

A list of portable NeverFixTests candidates has been identified using similar criteria. However, none of them are Chrome-specific failures. I will circle back to this issue when we expand our data ingestion to include all Chrome failures.

Aug 17 '20 00:08 KyleJu

@stephenmcgruer FYI, since this issue was raised by the Layout team. Happy to prioritize it if necessary.

Aug 17 '20 16:08 KyleJu

(Why is this issue in wpt.fyi not wpt-metadata?)

I've been playing again with importing TestExpectations via https://github.com/web-platform-tests/wpt-metadata/pull/473, after questions from folks who reasonably don't want to retriage tests they have already marked in TestExpectations. From that PR, I have uploaded data for all of css/css-* as PRs:

https://github.com/web-platform-tests/wpt-metadata/pull/475
https://github.com/web-platform-tests/wpt-metadata/pull/476
https://github.com/web-platform-tests/wpt-metadata/pull/477
https://github.com/web-platform-tests/wpt-metadata/pull/478
https://github.com/web-platform-tests/wpt-metadata/pull/479
https://github.com/web-platform-tests/wpt-metadata/pull/480

I then went through the newly linked bugs for problematic cases. Here are the general problems I found:

Tests linked to generic 'import this WPT directory' bugs, which goes against what wpt.fyi triage is trying to achieve (useful triage data).
Tests linked to bugs that are run_web_tests.py-specific (e.g. lack of fuzzy-reftest support).
Tests linked to bugs that are marked Fixed (or are duplicates of Fixed bugs), even though the tests still fail and still point at those bugs.
- In a few cases this turns out to be deliberate, though arguably the test should be in NeverFixTests then.

Also, as a final note, if we do make a regular thing of importing from TestExpectations, we will likely also need to answer:

How to deal with 'retriages', aka when a test failure changes crbug in TestExpectations (or is removed entirely!). We don't look at the Chromium commit diffs here, just the file, so we don't know when this happened.
- Perhaps we could mark entries as 'came from TestExpectations' and then remove them if they no longer exist there? Hacky!
How to deal with tests that pass on wpt.fyi but are in TestExpectations. So far we've just been ignoring that; maybe it's fine to do so.
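
The 'came from TestExpectations' idea above could look something like the following. This is only a sketch under assumptions: the `source` field is a hypothetical provenance marker that WPT Metadata does not have today, entries are modelled as plain dicts, and `reconcile` is a made-up name.

```python
from typing import Dict, List

# Hypothetical provenance label written on entries created by the importer.
SOURCE_MARKER = "TestExpectations"


def reconcile(entries: List[dict], current: Dict[str, str]) -> List[dict]:
    """Sync imported entries with the latest TestExpectations file.

    entries: dicts with "test" and "url" keys, plus an optional "source".
    current: test name -> bug URL, read from the current file only
             (no Chromium commit diffs are needed).
    """
    kept = []
    for entry in entries:
        if entry.get("source") != SOURCE_MARKER:
            # Manually triaged on wpt.fyi: never touched by the importer.
            kept.append(entry)
            continue
        bug = current.get(entry["test"])
        if bug is None:
            # The expectation was removed upstream, so drop the import.
            continue
        # Unchanged or retriaged: follow whatever bug the file names now.
        kept.append({**entry, "url": bug})
    return kept
```

Because only importer-owned entries are ever rewritten or removed, a retriage in TestExpectations cannot clobber triage done by hand on wpt.fyi.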

Thoughts on how we might resolve these are welcome; the answer may be that we need to clean up TestExpectations first.

Oct 15 '20 21:10 stephenmcgruer

(Why is this issue in wpt.fyi not wpt-metadata?)

Good point. I've moved the issue to wpt-metadata. (Old issue links should continue to work.)

Oct 15 '20 21:10 Hexcles