Quality report mail: Have a way to drill down on skipped tests

Open jensstutte opened this issue 3 years ago • 2 comments

The mail contains the interesting information:

There are 387 tests skipped in some configurations (lower than the median across other teams, 398). They are increasing from 386 you had two weeks ago.

It would be great to have a link to a list of those tests, or any other way to drill down here. I am not sure how easy this would be, but if we can count them, we can probably list them too?

jensstutte, Jan 04 '22 10:01

I wonder if this kind of data already exists in some dashboard that we can simply link to.

CC @ahal @Archaeopteryx @jmaher

marco-c, Jan 11 '22 11:01

I am not aware of anything. The problem is interesting because there are so many variations a test can be skipped on. Some edge cases that make this harder to solve:

  1. we don't support the test on android, etc. because the OS lacks the required features, so it is marked with skip-if
  2. it might be skip-if = verify; does that really count?
  3. it might be skip-if = os == win7, but we don't run this test on win7 (or whatever OS it is skipped on) in the first place
  4. when you add variants, some tests can run 80 times on a given push; skipping 1 specific config counts as skipped, whereas another test can say skip-if = linux, which could be skipping up to 30 configs
  5. visualizing all of the configs + variants is difficult
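
To make edge cases 3 and 4 concrete, here is a minimal Python sketch. The config matrix and the `skip_if` predicates are made up for illustration; they are not our real CI matrix and they do not use the actual manifest parser. The sketch only shows that one "skipped" flag can hide anything from zero configs to a large share of them.

```python
from itertools import product

# Hypothetical CI matrix: every name below is an assumption for illustration.
PLATFORMS = ["linux", "win", "mac", "android"]
BUILD_TYPES = ["opt", "debug", "asan"]
VARIANTS = ["default", "fission", "xorigin"]

CONFIGS = [
    {"os": os_name, "build": build, "variant": variant}
    for os_name, build, variant in product(PLATFORMS, BUILD_TYPES, VARIANTS)
]


def count_skipped(skip_if, configs=CONFIGS):
    """Return how many configs the given skip predicate matches."""
    return sum(1 for config in configs if skip_if(config))


# A broad condition, in the spirit of "skip-if = linux".
skipped_on_linux = count_skipped(lambda c: c["os"] == "linux")

# A narrow condition that hits exactly one config.
skipped_on_one = count_skipped(
    lambda c: c == {"os": "win", "build": "asan", "variant": "xorigin"}
)

# A condition for an OS that is not in the matrix at all (edge case 3).
skipped_on_win7 = count_skipped(lambda c: c["os"] == "win7")

print(len(CONFIGS), skipped_on_linux, skipped_on_one, skipped_on_win7)
```

All three tests would count as "skipped in some configurations" if we only look at the manifest annotation, even though the last one loses no coverage at all.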

I would like to propose something more interesting: maybe what we ship, i.e. !debug for linux64/windows/mac/android. That is where we have the highest risk of missing coverage and letting regressions through. I would be ok including debug. For asan, tsan, ccov, mingw32, and all the variants, we could provide that information in addition, but we need some way to really measure the risk.

An alternative I can think of is: "test runs on X tier1/2 configs in CI (not with --full on try), it is skipped on Y of them". Then Y/X is the fraction of configs the test is skipped on (and 1 - Y/X the fraction it runs on), which can be used to summarize all tests and put them in buckets, e.g. of 389 tests, 5 are skipped on <5% of configs, 2 on 5-10%, and 3 on >10% (or disabled).
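
A rough sketch of that bucketing in Python, taking the skipped share as Y / X. The function names, the bucket boundaries at exactly 5% and 10%, and the sample data are all assumptions for illustration; nothing here comes from an existing bugbug API.

```python
from collections import Counter


def skipped_fraction(total_configs, skipped_configs):
    """Share of the configs a test is scheduled on where it is skipped (Y / X)."""
    if total_configs <= 0:
        raise ValueError("total_configs must be positive")
    if not 0 <= skipped_configs <= total_configs:
        raise ValueError("skipped_configs must be between 0 and total_configs")
    return skipped_configs / total_configs


def bucket(fraction):
    """Map a skipped fraction to one of the proposed buckets."""
    if fraction == 0:
        return "not skipped"
    if fraction < 0.05:
        return "<5%"
    if fraction <= 0.10:
        return "5-10%"
    return ">10%"


def summarize(tests):
    """Count tests per bucket; `tests` maps a test name to (X, Y)."""
    return Counter(bucket(skipped_fraction(x, y)) for x, y in tests.values())


# Hypothetical sample data: (configs scheduled, configs skipped).
sample = {
    "test_a.js": (80, 1),   # skipped on one specific config
    "test_b.js": (80, 6),   # skipped on a handful of configs
    "test_c.js": (30, 30),  # effectively disabled
    "test_d.js": (40, 0),   # never skipped
}

summary = summarize(sample)
for name, count in sorted(summary.items()):
    print(f"{name}: {count}")
```

Counting X only over the tier1/2 configs that actually schedule the test is what would keep a skip-if for an OS we never run on from inflating the numbers.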

We still have the issues outlined in the numbered list above, but that gives everyone more actionable results.

jmaher, Jan 11 '22 14:01