
Group related tests

TomPridham opened this issue 2 years ago

First off, I really like this so far. Would it be possible to group related tests? Right now, tests are output in the order they run (I assume), which means they can get printed out of order relative to their mod. It would be helpful to have all the tests in a mod grouped together after they have finished running, to make comparing times and looking at failures easier. Current behavior:

 Starting 3 tests across 1 binaries
        PASS [   0.026s]                foo bar::whatever
        PASS [   0.036s]                foo baz::what
        PASS [   0.039s]                foo bar::something_else
     Summary [   0.300s] 3 tests run: 3 passed, 0 skipped

Desired behavior:

 Starting 3 tests across 1 binaries
        PASS [   0.026s]                foo bar::whatever
        PASS [   0.039s]                foo bar::something_else
        PASS [   0.036s]                foo baz::what
     Summary [   0.300s] 3 tests run: 3 passed, 0 skipped

TomPridham avatar Feb 15 '22 16:02 TomPridham

Thanks for the feature request! Tests are currently output in the order they finish. I think it's totally reasonable to request that they be output in the order they were started in (which is basically lexicographic order), though that would involve some buffering, so it would be a non-default option.

Would you like to implement it? It should be maybe a day's worth of work to do in the TestReporter.

sunshowers avatar Feb 15 '22 18:02 sunshowers

Sure. What would that look like? I can see I would need to add the option in the CLI and then in the builder. Would I then just push events to a Vec in report_event and then group everything and call write_event once a RunFinished event comes through?

TomPridham avatar Feb 15 '22 20:02 TomPridham

Thanks for offering to help!

> Sure. What would that look like? I can see I would need to add the option in the CLI and then in the builder. Would I then just push events to a Vec in report_event and then group everything and call write_event once a RunFinished event comes through?

No, I wouldn't buffer the entire test run. Instead, I'd buffer exactly as much as necessary. Here's what I would do:

  • In the TestStarted event handler, add the newly started test to an ordered running_tests data structure.
  • In the TestFinished event handler, compare the test that just finished against the first entry of running_tests.
    • If it is not the first entry, buffer its result in a finished_tests data structure.
    • If it is the first entry, print out that test, remove it from running_tests, and then walk running_tests from the front, printing the status of each test whose result is already buffered in finished_tests (removing it from both structures in the process), stopping at the first test that is still running.

I believe the right structures are:

  • running_tests: a double-ended queue -- for TestStarted, push elements onto the back of the queue, and for TestFinished, pop them from the front.
  • finished_tests: a simple HashMap<TestInstance, ExecutionStatuses>.
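
Here's a minimal sketch of how those pieces could fit together, with TestInstance and ExecutionStatuses reduced to hypothetical stand-ins for nextest's real types (the real reporter also handles retries, output capture, and so on):

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical stand-ins for nextest's real types.
#[derive(Clone, PartialEq, Eq, Hash)]
struct TestInstance(String);
struct ExecutionStatuses; // the finished test's result(s)

#[derive(Default)]
struct OrderedReporter {
    running_tests: VecDeque<TestInstance>,
    finished_tests: HashMap<TestInstance, ExecutionStatuses>,
}

impl OrderedReporter {
    fn on_test_started(&mut self, test: TestInstance) {
        // Record start order at the back of the queue.
        self.running_tests.push_back(test);
    }

    fn on_test_finished(&mut self, test: TestInstance, statuses: ExecutionStatuses) {
        if self.running_tests.front() != Some(&test) {
            // Not the oldest running test: buffer its result for later.
            self.finished_tests.insert(test, statuses);
            return;
        }
        // The oldest running test just finished: print it right away...
        self.running_tests.pop_front();
        self.print_status(&test, &statuses);
        // ...then drain any tests behind it that have already finished.
        while let Some(front) = self.running_tests.front().cloned() {
            match self.finished_tests.remove(&front) {
                Some(buffered) => {
                    self.running_tests.pop_front();
                    self.print_status(&front, &buffered);
                }
                None => break, // this test is still running; stop here
            }
        }
    }

    fn print_status(&self, test: &TestInstance, _statuses: &ExecutionStatuses) {
        println!("        PASS [   ...   ] {}", test.0);
    }
}

fn main() {
    let mut r = OrderedReporter::default();
    r.on_test_started(TestInstance("bar::whatever".into()));
    r.on_test_started(TestInstance("bar::something_else".into()));
    r.on_test_started(TestInstance("baz::what".into()));
    // baz::what finishes first but is buffered until the two bar tests print.
    r.on_test_finished(TestInstance("baz::what".into()), ExecutionStatuses);
    r.on_test_finished(TestInstance("bar::whatever".into()), ExecutionStatuses);
    r.on_test_finished(TestInstance("bar::something_else".into()), ExecutionStatuses);
}
```

The deque preserves start order without any sorting, and the HashMap lookup keeps the drain loop cheap per test.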

Hope this helps!

sunshowers avatar Feb 15 '22 20:02 sunshowers

BTW, I wouldn't make this a config option, just a CLI flag: --status-order finished/started (finished is the current default; started is what you're asking for).
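
For illustration, the flag might be declared roughly like this with clap's derive API (hypothetical names; nextest's actual CLI may be wired up differently):

```rust
use clap::{Parser, ValueEnum};

// Hypothetical flag definition for the proposed --status-order option.
#[derive(Copy, Clone, Debug, ValueEnum)]
enum StatusOrder {
    /// Print each test as it finishes (current behavior).
    Finished,
    /// Buffer as needed and print tests in the order they started.
    Started,
}

#[derive(Parser, Debug)]
struct Opts {
    /// Order in which test statuses are printed.
    #[arg(long, value_enum, default_value = "finished")]
    status_order: StatusOrder,
}

fn main() {
    let opts = Opts::parse();
    println!("{:?}", opts.status_order);
}
```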

sunshowers avatar Feb 15 '22 20:02 sunshowers

Thinking about this more, I realized a couple of things:

  • I'm not sure how this would interact with fail-fast (canceling the run as soon as the first failure happens). The failing test should be printed before the cancellation message, which would break the abstraction a bit.
  • I'm also quite worried about how unbounded the buffer can get, since the stdout/stderr of every buffered test has to be stored in memory.

Given all the extra complexity here, I'm leaning towards wontfix for now. At the very least I'd like to see good answers for these questions as well as a solid strategy for testing this new feature.

sunshowers avatar Feb 15 '22 22:02 sunshowers

Would making this option and fail-fast mutually exclusive (similar to no_capture and the fail/success_output options) work for the first one? How big do you think the buffer could potentially get? And is the concern that nextest might OOM when paired with this option?

TomPridham avatar Feb 15 '22 23:02 TomPridham

Thanks for the response! As far as OOMing goes, applying backpressure is quite hard here because as you can see in https://github.com/nextest-rs/nextest/blob/421dec9d4da31f41f746172eaded3d09dc184364/nextest-runner/src/runner.rs#L233-L291, we select on both the events coming from the test run and from the signal handler (important for handling Ctrl-C properly). Blocking on that end of the test run is likely not going to work very well. So a better strategy would be to stop scheduling tests once the buffer fills up, and figure out a way for the reporter to tell the runner to start scheduling them again.
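
One shape the stop-scheduling idea could take is a fixed pool of permits, sketched here with a bounded crossbeam channel (the names and the permit cap are illustrative, not nextest's actual API):

```rust
use crossbeam_channel::bounded;

const MAX_UNPRINTED: usize = 64; // assumed cap on buffered test outputs

fn main() {
    let (release, acquire) = bounded::<()>(MAX_UNPRINTED);
    // Pre-fill the pool with permits.
    for _ in 0..MAX_UNPRINTED {
        release.send(()).unwrap();
    }

    // Runner side: take a permit before scheduling each test. Once
    // MAX_UNPRINTED tests have been scheduled but not yet printed, this
    // recv() blocks, pausing scheduling.
    acquire.recv().unwrap();
    // ... spawn the test ...

    // Reporter side: when a buffered test's status is finally printed,
    // return the permit so the runner can schedule another test.
    release.send(()).unwrap();
}
```

The key property is that the blocking happens on the scheduling side, not inside the select loop that also has to keep servicing signal-handler events.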

Since this is a non-default option we can try and do an unbounded strategy for now and revisit this if we run into practical issues.

I think making this option and fail-fast mutually exclusive is... ok. My main issue with it is that they're not really logically related the way --no-capture and --success/failure-output are. Ideally what we should be able to do is let the reporter know we're about to cancel the test run, and the reporter then dumps out all buffered successes and the first failure, then switches to unbuffered mode afterwards.
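
Continuing the hypothetical OrderedReporter sketch from earlier (and assuming a buffering: bool field is added to it), that cancellation handoff might look something like:

```rust
impl OrderedReporter {
    // Called when the runner signals that the run is about to be
    // canceled: flush everything buffered in start order, then switch
    // to unbuffered output for whatever arrives afterwards.
    fn on_run_canceling(&mut self) {
        while let Some(front) = self.running_tests.pop_front() {
            if let Some(statuses) = self.finished_tests.remove(&front) {
                self.print_status(&front, &statuses);
            }
            // Tests with no buffered result are still running; with
            // buffering off, they will print immediately as they finish.
        }
        self.buffering = false;
    }
}
```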

I would really like to see a solid testing strategy for this feature. There's several complicated state transitions involved here so it would be necessary to have good tests for it.

sunshowers avatar Feb 17 '22 02:02 sunshowers

With the switch to async code and futures, the landscape on this changes quite a bit.

We call buffer_unordered over here, which controls the amount of parallelism:

https://github.com/nextest-rs/nextest/blob/c2bd1b6f1b54990de43a9378f12ae6aa04f424ba/nextest-runner/src/runner.rs#L291

We could replace this with buffered if we want. However, I'm really worried about a very slow test blocking output for all subsequent tests. So ideally, to solve this, we'd have some sort of wrapper around the Receiver that returns tests in order if they complete within some sort of time window, otherwise returns them in any order.
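
For illustration, here's that difference in miniature (not nextest code; just futures and tokio):

```rust
use std::time::Duration;
use futures::stream::{self, StreamExt};

#[tokio::main]
async fn main() {
    // Three "tests" whose finish order differs from their start order.
    let tests = stream::iter([30u64, 10, 20]).map(|ms| async move {
        tokio::time::sleep(Duration::from_millis(ms)).await;
        ms
    });

    // `buffered` polls up to 8 futures concurrently but yields results
    // in input order, so the slow 30ms future holds back the two faster
    // ones behind it -- the head-of-line blocking concern above.
    // `buffer_unordered` (the current code) would yield 10, 20, 30.
    let results: Vec<u64> = tests.buffered(8).collect().await;
    assert_eq!(results, vec![30, 10, 20]);
}
```

The time-window wrapper would sit between these two extremes: hold a completed result back only while earlier-started tests still have time left on the clock, and fall back to any-order output after that.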

sunshowers avatar Jul 22 '22 20:07 sunshowers