results-collection

Reject incomplete result sets

Open • jugglinmike opened this issue 7 years ago • 5 comments

Due to infrastructural problems which are still under investigation, this project regularly produces datasets which omit test results. Occasionally, the omission has been quite severe; gh-234 describes one such example.

Incomplete datasets should not be tolerated.

Depends on gh-465.

jugglinmike, Feb 15 '18 17:02

Edge 15 on Windows 10 @efa3c75747 is notably incomplete.

gsnedders, Mar 12 '18 15:03

@jugglinmike what is the status of this? We could have this be entirely a matter for the results receiver (https://github.com/web-platform-tests/wpt.fyi/issues/55) to validate, but perhaps you still want to have some check in place to retry?

foolip, Apr 17 '18 10:04

For a month or more, we've known that the "Edge/Windows/Sauce Labs" browser configuration consistently fails to report complete results. The browser would crash, the WPT CLI would not respond by restarting, and we would fail to collect results for every test that had yet to run in the affected segment (the 19th and 52nd segments when the test suite was partitioned into 100 segments).
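To make the failure mode concrete, here is a minimal sketch of how such an omission could be detected for a single segment: compare the tests the segment was expected to run against the tests that actually reported a result. The function name `missing_tests` and the test IDs are hypothetical and are not taken from this project or from the WPT CLI.

```python
from typing import Iterable, List


def missing_tests(expected: Iterable[str], reported: Iterable[str]) -> List[str]:
    """Return the expected test IDs that have no reported result, sorted."""
    return sorted(set(expected) - set(reported))


# Hypothetical segment: the browser crashes after the second test,
# so the remaining tests never produce a result.
expected = ["/a.html", "/b.html", "/c.html", "/d.html"]
reported = ["/a.html", "/b.html"]
print(missing_tests(expected, reported))  # ['/c.html', '/d.html']
```

An empty list would mean the segment is complete; anything else identifies exactly which tests were lost.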

I offered a solution to the WPT CLI, but it was not accepted due to concerns for its performance implications. @jgraham followed up with an alternate solution which was accepted.

The Edge build triggered on April 13 ran against revision c53d084cc57749bc666e42aaceeb34daa42539c7 of WPT. This revision predated the fix, and the data set demonstrates the omission.

The Edge build triggered on April 15 ran against revision of WPT. That happens to be the very same revision that introduced the fix, and the data set is complete.

Long story short: Edge continues to crash, but that no longer interferes with report collection. I've submitted gh-541 as a next step towards resolving this issue. Would you mind taking a look, @foolip?

jugglinmike, Apr 17 '18 17:04

With commit 84486a4fbf98e936618e1ff5e6ce808c79bbf877, I re-introduced the "retry" heuristic that we initially implemented in the previous infrastructure. I'm hopeful that this will allow us to recover from intermittent errors that previously impacted completeness.

In light of that (and following promising results from gh- ), that commit also configures the system to reject incomplete results. This is a more aggressive step towards the resolution of this issue.

Already, it has identified one case where tests are being skipped deterministically:

  • "chunk" number: 43 of 100
  • browser: Safari 11.0
  • WPT revision: 43dd25c888da9b903cedeb3bb5b8a832b2d51b97

I intend to research the cause of this issue later today.
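For readers unfamiliar with the approach, the combination described above (retry a chunk, then reject it if it is still incomplete) might look roughly like the following. This is only a sketch under assumptions: `collect_chunk`, `IncompleteResultsError`, and the `max_attempts` default are hypothetical names and values, not the code from the referenced commit.

```python
from typing import Callable, Iterable, List, Set


class IncompleteResultsError(Exception):
    """Raised when a chunk still lacks results after every attempt."""


def collect_chunk(
    run_chunk: Callable[[], Iterable[str]],
    expected: Iterable[str],
    max_attempts: int = 3,  # assumed value, for illustration only
) -> List[str]:
    """Run a chunk until every expected test reports, or reject it."""
    expected_set: Set[str] = set(expected)
    missing: Set[str] = expected_set
    for _ in range(max_attempts):
        reported = set(run_chunk())
        missing = expected_set - reported
        if not missing:
            # Every expected test reported: accept the chunk.
            return sorted(reported)
    # Still incomplete after all retries: refuse to publish the data.
    raise IncompleteResultsError(
        f"{len(missing)} test(s) never reported: {sorted(missing)}"
    )


# Hypothetical intermittent failure: the first run loses a test,
# the second run is complete.
runs = iter([["/a.html"], ["/a.html", "/b.html"]])
print(collect_chunk(lambda: next(runs), ["/a.html", "/b.html"]))
```

An intermittent crash is absorbed by the retry, while a chunk that skips tests deterministically (as in the Safari case listed above) exhausts its attempts and raises, which is how rejection surfaces that class of problem.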

jugglinmike, Apr 30 '18 16:04

@jugglinmike this is done now, right?

foolip, Sep 14 '18 12:09