results-collection Reject incomplete result sets

Due to infrastructural problems that are still under investigation, this project regularly produces datasets that omit test results. Occasionally the omission has been severe; gh-234 describes one such example.

Incomplete datasets should not be tolerated.

Depends on gh-465.

Feb 15 '18 17:02 jugglinmike

Edge 15 on Windows 10 @efa3c75747 is notably incomplete.

Mar 12 '18 15:03 gsnedders

@jugglinmike what is the status of this? We could have this be entirely a matter for the results receiver (https://github.com/web-platform-tests/wpt.fyi/issues/55) to validate, but perhaps you still want to have some check in place to retry?

Apr 17 '18 10:04 foolip

For a month or more, we've known that the "Edge/Windows/Sauce Labs" browser configuration consistently fails to report complete results. The browser would crash, the WPT CLI would not respond by restarting it, and we would fail to collect results for all tests that had yet to be run in the affected segment (the 19th and 52nd segments when the test suite was partitioned into 100 segments).

I proposed a fix for the WPT CLI, but it was not accepted due to concerns about its performance implications. @jgraham followed up with an alternative solution, which was accepted.

The Edge build triggered on April 13 ran against revision c53d084cc57749bc666e42aaceeb34daa42539c7 of WPT. This revision predated the fix, and the data set demonstrates the omission.

The Edge build triggered on April 15 ran against revision of WPT. That happens to be the very same revision that introduced the fix, and the data set is complete.

Long story short: Edge continues to crash, but that no longer interferes with report collection. I've submitted gh-541 as a next step towards resolving this issue. Would you mind taking a look, @foolip?

Apr 17 '18 17:04 jugglinmike

With commit 84486a4fbf98e936618e1ff5e6ce808c79bbf877, I re-introduced the "retry" heuristic that we initially implemented in the previous infrastructure. I'm hopeful that this will allow us to recover from intermittent errors that previously impacted completeness.

In light of that (and following promising results from gh- ), that commit also configures the system to reject incomplete results. This is a more aggressive step towards the resolution of this issue.

Already, it has identified one case where tests are being skipped deterministically:

"chunk" number: 43 of 100
browser: Safari 11.0
WPT revision: 43dd25c888da9b903cedeb3bb5b8a832b2d51b97

I intend to research the cause of this issue later today.
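
As a rough illustration of the retry-then-reject behavior described in this comment, here is a minimal Python sketch. It is not the code from commit 84486a4fbf98e936618e1ff5e6ce808c79bbf877. Every name in it (`collect_chunk`, `run_chunk`, `missing_tests`, `IncompleteResultsError`, `max_attempts`) is hypothetical, and it assumes a chunk's results can be represented as a mapping from test name to status.

```python
from typing import Callable, Dict, Iterable, Set


class IncompleteResultsError(Exception):
    """Raised when a chunk still lacks results after every allowed attempt."""


def missing_tests(expected: Iterable[str], reported: Iterable[str]) -> Set[str]:
    """Return the expected test names that have no reported result."""
    return set(expected) - set(reported)


def collect_chunk(
    expected: Iterable[str],
    run_chunk: Callable[[], Dict[str, str]],  # hypothetical runner for one chunk
    max_attempts: int = 3,                    # hypothetical retry budget
) -> Dict[str, str]:
    """Run a chunk, retrying on incomplete results and rejecting it
    outright if it never becomes complete."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    expected = list(expected)
    missing: Set[str] = set(expected)

    for _ in range(max_attempts):
        results = run_chunk()
        missing = missing_tests(expected, results)
        if not missing:
            # Every expected test reported a result: accept this run.
            return results

    # Intermittent failures should have been absorbed by the retries, so a
    # chunk that is still incomplete is rejected rather than uploaded.
    raise IncompleteResultsError(
        f"{len(missing)} test(s) still missing after {max_attempts} "
        f"attempts: {sorted(missing)}"
    )
```

Under this scheme, a chunk that skips tests deterministically (like the Safari case above) exhausts its retries and raises an error instead of silently producing a short dataset.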

Apr 30 '18 16:04 jugglinmike

@jugglinmike this is done now, right?

Sep 14 '18 12:09 foolip
