Handle flaky test returning to previously approved stable failure
Today, a consistently failing test that rarely times out causes redness when it leaves the flaky state.
Example:
1. Test A normally produces a RuntimeError.
2. Because of test framework issues, sometimes it will time out.
3. Sometimes the deflaking will produce consistent timeouts and the test causes redness directly.
4. Sometimes the deflaking will detect the test as flaky and it will be marked flaky.
5. After not producing a timeout for 100 runs it will be marked as RuntimeError again, which causes redness.
The proposal is to handle the redness caused by 3) and 5) somehow. 3) is likely temporary and persists for only a single build, while 5) is permanent because it is the steady state. So perhaps, when a test goes from a flaky or timeout state back to a previously approved state, the previous approval should be re-applied.
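To make the idea concrete, here is a minimal sketch of how re-applying a previous approval could work. It is not how the actual results tooling is implemented: the `TestState` type, the `record_result` and `is_red` helpers, and the result strings are all hypothetical names chosen for illustration. The sketch assumes that a flaky test is never red, and that an approval is remembered while the test is flaky or timing out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical result names; the real tooling may use different ones.
TRANSIENT = {"Flaky", "Timeout"}


@dataclass(frozen=True)
class TestState:
    result: str                                # latest recorded result
    approved: Optional[str] = None             # result that is currently approved
    remembered_approval: Optional[str] = None  # approval held before going transient


def record_result(state: TestState, new_result: str) -> TestState:
    """Return the next state after a new result is recorded."""
    approved = state.approved
    remembered = state.remembered_approval
    if new_result in TRANSIENT:
        # Leaving a stable state: remember its approval instead of losing it.
        if approved is not None:
            remembered = approved
        approved = None
    elif new_result == remembered:
        # Back to the previously approved result: re-apply the old approval.
        approved = remembered
    elif new_result != approved:
        # A different stable result needs a fresh approval.
        approved = None
    return TestState(new_result, approved, remembered)


def is_red(state: TestState, expected: str = "Pass") -> bool:
    """A result is red unless it is expected, flaky, or approved (assumption)."""
    return state.result not in (expected, "Flaky", state.approved)
```

Under these assumptions, a test approved as `RuntimeError` that becomes `Flaky` and later settles on `RuntimeError` again is not red, while a test that settles on a different, never-approved failure still is. Consistent timeouts remain red in this sketch until the test returns to its approved result, so it covers only the return to the steady state.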
Real world examples:
- https://dart-ci.firebaseapp.com/#showLatestFailures=false&test=co19_2/Language/Expressions/Constants/integer_size_t04&configurations=dart2js-hostasserts-linux-ia32-d8
- https://dart-ci.firebaseapp.com/#showLatestFailures=false&test=corelib_2/integer_parsed_mul_div_vm_test&configurations=dart2js-hostasserts-linux-ia32-d8
Consistent timeouts in compilation: https://dart-ci.appspot.com/log/dart2js-strong-hostasserts-linux-ia32-d8/dart2js-hostasserts-linux-ia32-d8/12360/corelib_2/integer_parsed_mul_div_vm_test
Consistent RTE (trying to use int64 semantics on the web): https://dart-ci.appspot.com/log/dart2js-strong-hostasserts-linux-ia32-d8/dart2js-hostasserts-linux-ia32-d8/12361/corelib_2/integer_parsed_mul_div_vm_test