dart_ci

Handle flaky test returning to previously approved stable failure

Open athomas opened this issue 5 years ago • 0 comments

Today, a consistently failing test that rarely times out causes redness when it leaves the flaky state.

Example:

  1. Test A normally produces a RuntimeError.
  2. Because of test framework issues, it sometimes times out.
  3. Sometimes deflaking produces consistent timeouts, and the test causes redness directly.
  4. Sometimes deflaking detects the test as flaky, and it is marked flaky.
  5. After 100 runs without a timeout, it is marked as RuntimeError again, which causes redness.
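The flaky window in steps 4) and 5) can be sketched roughly as follows. This is an illustrative model only, not the actual dart_ci implementation: the class `FlakyTracker`, its fields, and the result strings are hypothetical names, and the only value taken from the issue is the 100-run threshold.

```python
from dataclasses import dataclass

# From the issue: a flaky test returns to a stable state after
# 100 runs without a timeout.
FLAKY_WINDOW = 100


@dataclass
class FlakyTracker:
    """Hypothetical per-test bookkeeping for the flaky state."""

    flaky: bool = False
    runs_without_timeout: int = 0

    def record(self, result: str, deflaked_as_flaky: bool = False) -> bool:
        """Record one run.

        Returns True only on the run where the test leaves the flaky
        state, which is the moment step 5) turns the build red today.
        """
        if deflaked_as_flaky:
            # Step 4: deflaking saw inconsistent results.
            self.flaky = True
            self.runs_without_timeout = 0
            return False
        if not self.flaky:
            return False
        if result == "Timeout":
            # Any timeout restarts the window.
            self.runs_without_timeout = 0
            return False
        self.runs_without_timeout += 1
        if self.runs_without_timeout >= FLAKY_WINDOW:
            # Step 5: the test is stable again.
            self.flaky = False
            self.runs_without_timeout = 0
            return True
        return False
```

In this model, the run on which `record` returns `True` is the one that reports RuntimeError as a fresh, unapproved result.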

The proposal is to handle the redness caused by 3) and 5). The redness from 3) is likely temporary and persists for only a single build. The redness from 5) is permanent because RuntimeError is the steady state. So perhaps, if a test goes from flaky or timeout back to a previously approved state, the previous approval should be re-applied.
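One way to read the proposed rule is as a single predicate. This is a minimal sketch under stated assumptions: the function name, the state strings, and the idea of a stored "last approved result" are all hypothetical and are not taken from the dart_ci code.

```python
from typing import Optional

# Hypothetical labels for the states the issue treats as transient.
TRANSIENT_STATES = {"flaky", "Timeout"}


def should_reapply_approval(
    previous_state: str,
    new_result: str,
    last_approved_result: Optional[str],
) -> bool:
    """Return True when the earlier approval could be re-applied.

    Assumption: the system remembers the most recent result that was
    approved for this test and configuration, or None if there is none.
    """
    if last_approved_result is None:
        return False
    # Only a return from a transient state to the approved result qualifies.
    return (
        previous_state in TRANSIENT_STATES
        and new_result == last_approved_result
    )
```

For the example above, a test that goes from flaky back to an approved RuntimeError would qualify, while a test that comes out of the flaky state with a different failure would still turn the build red.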

Real-world examples:

https://dart-ci.firebaseapp.com/#showLatestFailures=false&test=co19_2/Language/Expressions/Constants/integer_size_t04&configurations=dart2js-hostasserts-linux-ia32-d8

https://dart-ci.firebaseapp.com/#showLatestFailures=false&test=corelib_2/integer_parsed_mul_div_vm_test&configurations=dart2js-hostasserts-linux-ia32-d8

Consistent timeouts in compilation: https://dart-ci.appspot.com/log/dart2js-strong-hostasserts-linux-ia32-d8/dart2js-hostasserts-linux-ia32-d8/12360/corelib_2/integer_parsed_mul_div_vm_test

Consistent RTE (trying to use int64 semantics on the web): https://dart-ci.appspot.com/log/dart2js-strong-hostasserts-linux-ia32-d8/dart2js-hostasserts-linux-ia32-d8/12361/corelib_2/integer_parsed_mul_div_vm_test

athomas • Jan 27 '21 10:01