
Test environment report

roberth opened this issue · 1 comment

Is your feature request related to a problem? Please describe.

Nix runs in many environments, each of which needs testing. Not all tests run in all environments. Any non-trivial logic in test code can contain bugs, which will go unnoticed. In particular, the conditions that control whether a test runs are fail-unsafe: if the condition is wrong, the test is silently skipped instead of failing. (E.g. if system == "x86-64_linux" .... Did you spot it?)
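For contrast, a fail-safe formulation rejects unrecognized values instead of silently skipping; a minimal bash sketch (the system variable and the run_test/skip_test helpers are hypothetical):

    # A misspelled or unknown system name aborts loudly instead of
    # silently skipping the test.
    case "$system" in
        x86_64-linux)  run_test ;;
        aarch64-linux) skip_test "no KVM available" ;;
        *) echo "unexpected system: $system" >&2; exit 1 ;;
    esac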

Given the source of any test, I want to be able to tell reliably:

  • Whether it runs at all. If not, CI should fail.
  • In which environments/situations it is run.

Describe the solution you'd like

  • Make all tests report whether they have run (especially in the case of the Nix package) and, where there is a test matrix, in which situations they have run.

  • Write a derivation that aggregates all the test reports from all our builds (packages, checks, hydraJobs, etc.)

    • Make it so that certain "feasible" subsets can be reported, e.g. reasonably quick tests on x86_64-linux with kvm.
  • Make all derivations report which tests exist. Join this to the test run reports. If not a single test run report is found for a test, fail (see the sketch after this list).

    • The logic for finding tests that exist should be dead simple, such as grepping the source.
    • Also check the converse: if a test run is reported for a test that the simple logic does not find, the detection logic has a bug.
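A minimal sketch of both checks, assuming a line-per-test report format and tests detected by grepping for test_* function definitions (the paths, names, and report format here are all hypothetical):

    #!/usr/bin/env bash
    set -euo pipefail

    # Tests that exist: dead-simple detection by grepping the source.
    grep -rhoE '^test_[A-Za-z0-9_]+' tests/ | sort -u > declared.txt

    # Tests that ran: first column of the aggregated run reports.
    cut -f1 reports/*.tsv | sort -u > executed.txt

    # A declared test with no run report anywhere is a CI failure...
    if comm -23 declared.txt executed.txt | grep .; then
        echo "error: the tests above were never executed" >&2
        exit 1
    fi

    # ...and the converse: a run report for a test the simple logic
    # cannot find means the detection logic has a bug.
    if comm -13 declared.txt executed.txt | grep .; then
        echo "error: the tests above ran but were not detected" >&2
        exit 1
    fi

This check would itself run inside the aggregating derivation, with the two input files built from the joined build outputs.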

Describe alternatives you've considered

Just hope.

Use an existing solution? I didn't find one. Most of our test code is custom, so I have doubts, but if anyone knows of anything, please comment. JUnit's report format is a standard that is exported by a number of frameworks; it would be nice to reuse that, perhaps. Maybe combine this with a cross-referencing solution? Something like GHC Notes, or something that formalizes "grep across files with identifiers". Combine with a coverage report? Combine the coverage reports' per-line info? (Where we have coverage reporting; something else elsewhere.)

Additional context

Priorities

Add :+1: to issues you find important.

roberth · Apr 19 '24 23:04

Did a small amount of research:

Functional tests

Performance matters, so we want to log efficiently.

  • Have an environment variable that controls the report location
  • If set, define functions that log
  • If not set, define functions that are no-ops

When logging is enabled, do not create child processes (i.e. use shell builtins such as echo/printf). Do not reopen the report file for every write; open it once on a dedicated file descriptor. Keep that descriptor from leaking into child processes, as CLOEXEC would; in the shell that means closing it explicitly, e.g. exec 3>&-.
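A minimal sketch of this setup, with NIX_TEST_REPORT as a hypothetical name for the report-location variable:

    if [[ -n "${NIX_TEST_REPORT:-}" ]]; then
        # Open the report once; every log call reuses fd 3 instead of
        # reopening the file.
        exec 3>>"$NIX_TEST_REPORT"
        logTestResult() {
            # printf is a shell builtin, so no child process is created.
            printf '%s\t%s\n' "$1" "$2" >&3
        }
    else
        # Reporting disabled: no-ops keep the hot path cheap.
        logTestResult() { :; }
    fi

    # Usage (test name and status are illustrative):
    logTestResult "lang/eval-okay-simple" pass

    # Close the descriptor for a child that must not inherit it,
    # the shell-level stand-in for CLOEXEC:
    some_child_process 3>&-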

Report format

JUnit XML looks suitable. Open questions:

  • Nested testsuites, or flattened names joined with .? These layouts are mostly isomorphic, so the choice probably depends on the tooling we use, and even then conversion is possible; flattening in particular is lossless. This suggests nested should be the default layout, and we could preprocess for tooling that needs the flat form, but maybe that's not good DX.
  • Single file or multiple?
  • Use properties to represent the test environment (from the test matrix)? Alternatively, we could encode a massive hierarchy in the test suite names, but that does not seem conducive to analysis, because it doesn't convey the kind of equivalence that a duplicated test case with different properties has. Properties seem better, but we may have to revisit this; see the example after this list.
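For illustration, a JUnit XML fragment using suite-level properties for the matrix dimensions (the property names and values are made up):

    <testsuite name="functional" tests="2" failures="0" skipped="1">
      <properties>
        <property name="system" value="x86_64-linux"/>
        <property name="sandbox" value="true"/>
      </properties>
      <testcase classname="flakes" name="flake-update" time="0.42"/>
      <testcase classname="flakes" name="flake-registry" time="0.00">
        <skipped message="requires network access"/>
      </testcase>
    </testsuite>

The same two test cases would appear again under a suite with a different system property; the properties express that equivalence directly, where a dotted-name hierarchy would obscure it.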

Tooling

  • https://github.com/NixOS/nixpkgs/pull/319020
    • Allows filtering. Filtering by test name works fine, but filtering by skipped tests doesn't hide the many irrelevant suites.
  • TODO automated checks? A goal in this issue is to verify that each test case that exists is executed in at least one environment. I wouldn't be surprised if this still has to be custom.

roberth · Jun 16 '24 14:06