bugbug
Quality report mail: Have a way to drill down on skipped tests
The mail contains this interesting information:
There are 387 tests skipped in some configurations (lower than the median across other teams, 398). They are increasing from 386 you had two weeks ago.
It would be great to have a link to a list of those tests, or any other way to drill down here. I am not sure how easy this would be, but if we can count them, we can probably list them too?
I wonder if this kind of data already exists in some dashboard that we can simply link to.
CC @ahal @Archaeopteryx @jmaher @ahal
I am not aware of anything. The problem is interesting because there are so many variations a test can be skipped on. Some edge cases that make this harder to solve:
- we don't support the test on android, etc. because the OS doesn't support the features, so it is marked with `skip-if`
- it might be `skip-if = verify`, does that really count?
- it might be `skip-if = os == win7`, but we don't run this test on win7 (or whatever OS it is skipped on)
- when you add variants, some tests can run 80 times on a given push; skipping 1 specific config counts as skipped, whereas another test can say `skip-if = linux`, which could be skipping up to 30 configs
- visualizing all of the configs + variants is difficult
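To make the "1 config vs. 30 configs" point concrete, here is a minimal sketch. Everything in it is hypothetical: the platform, build type, and variant names are made up, and the skip conditions are plain Python lambdas rather than real manifest expressions. It only shows that two tests both counted as "skipped" can differ a lot in how many configs they miss.

```python
from itertools import product

# Hypothetical config matrix; the real CI matrix is much larger.
PLATFORMS = ["linux", "win", "mac", "android"]
BUILD_TYPES = ["opt", "debug", "asan"]
VARIANTS = ["default", "variant-a", "variant-b"]

CONFIGS = [
    {"os": os_name, "build": build, "variant": variant}
    for os_name, build, variant in product(PLATFORMS, BUILD_TYPES, VARIANTS)
]


def skipped_configs(condition, configs=CONFIGS):
    """Return the configs for which the (hypothetical) skip condition is true."""
    return [c for c in configs if condition(c)]


# A broad condition, roughly like "skip on linux".
broad = skipped_configs(lambda c: c["os"] == "linux")

# A narrow condition that hits exactly one config.
narrow = skipped_configs(
    lambda c: c["os"] == "win"
    and c["build"] == "asan"
    and c["variant"] == "variant-b"
)

print(len(CONFIGS), len(broad), len(narrow))  # prints "36 9 1"
```

Both tests would add 1 to a plain "skipped tests" count, even though one misses a quarter of this toy matrix and the other a single job.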
I would like to propose something more interesting: maybe only what we ship, i.e. !debug for linux64/windows/mac/android. That is where we have the highest risk of missing coverage and letting regressions through. I would be OK with including debug. For asan, tsan, ccov, mingw32, and all the variants, we could provide that information in addition, but we need some way to really measure the risk.
An alternative I can think of is: "test runs on X tier1/2 configs in CI (not with --full on try), it is skipped on Y of them". Then Y/X is the % skipped, which can be used to summarize all tests and put them in buckets, e.g. 389 tests: 5 are skipped on <5% of configs, 2 are skipped on 5-10%, and 3 are skipped on >10% (or disabled).
We still have the issues outlined above, but that gives everyone more actionable results.
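A rough sketch of that bucketing, assuming we already have, per test, the number of tier1/2 configs it is scheduled on (X) and the number it is skipped on (Y). The test names, the counts, and the bucket boundaries are illustrative assumptions only, not data from the report.

```python
from collections import Counter


def skip_ratio(skipped, total):
    """Fraction of configs a test is skipped on (Y / X)."""
    if total <= 0:
        raise ValueError("test must be scheduled on at least one config")
    return skipped / total


def bucket(ratio):
    """Map a skip ratio to a coarse bucket label (boundaries are assumptions)."""
    if ratio == 0:
        return "never skipped"
    if ratio < 0.05:
        return "<5%"
    if ratio <= 0.10:
        return "5-10%"
    return ">10%"


# Hypothetical data: test name -> (configs skipped on, configs scheduled on)
tests = {
    "test_a.js": (0, 40),
    "test_b.js": (1, 40),   # 2.5%
    "test_c.js": (3, 40),   # 7.5%
    "test_d.js": (30, 40),  # 75%
    "test_e.js": (40, 40),  # 100%, effectively disabled
}

summary = Counter(bucket(skip_ratio(y, x)) for y, x in tests.values())

for label in ("never skipped", "<5%", "5-10%", ">10%"):
    print(f"{label}: {summary[label]}")
```

The same per-test ratios could also back the drill-down list asked for in the issue, since each bucket is just a filter over the test names.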