elixir icon indicating copy to clipboard operation
elixir copied to clipboard

[ExUnit] Execute only one test with the line filter, per file with multiple test modules

Open studzien opened this issue 3 years ago • 5 comments

Intro

This PR is the continuation of #11863. This PR changes the behavior of ExUnit's filter. It makes sure that only the test referenced in the filter is executed in case more than one test module is defined in one test file (see the example below). Defining multiple test modules in one file is handy when parallelizing tests aggressively, such as defining each test as a separate async: true test module.

Example

Given the following test module:

defmodule MultipleTestModulesTest do
  defmodule TestModule1 do
    use ExUnit.Case

    test "one" do
      assert "one" == 1
    end
  end

  defmodule TestModule2 do
    use ExUnit.Case

    test "two" do
      assert "two" == 2
    end
  end
end

Currently, in main, when we run the test with the line filter mix test test/multiple_test_modules_in_file.exs:14, both tests will be executed. The tests in a file are filtered on a module by module basis -- the first test from every module that starts before the line 14 is executed:

% mix test test/multiple_test_modules_in_file.exs:14
Excluding tags: [:test]
Including tags: [line: "14"]



  1) test two (MultipleTestModulesTest.TestModule2)
     test/multiple_test_modules_in_file.exs:13
     Assertion with == failed
     code:  assert "two" == 2
     left:  "two"
     right: 2
     stacktrace:
       test/multiple_test_modules_in_file.exs:14: (test)



  2) test one (MultipleTestModulesTest.TestModule1)
     test/multiple_test_modules_in_file.exs:5
     Assertion with == failed
     code:  assert "one" == 1
     left:  "one"
     right: 1
     stacktrace:
       test/multiple_test_modules_in_file.exs:6: (test)


Finished in 0.02 seconds (0.00s async, 0.02s sync)
2 tests, 2 failures

The changes proposed in this PR consider both the first and last line of every test and make sure that only the test is executed if the filter is in that range:

% mix test test/multiple_test_modules_in_file.exs:14
Excluding tags: [:test]
Including tags: [line: "14"]



  1) test two (MultipleTestModulesTest.TestModule2)
     test/multiple_test_modules_in_file.exs:13
     Assertion with == failed
     code:  assert "two" == 2
     left:  "two"
     right: 2
     stacktrace:
       test/multiple_test_modules_in_file.exs:14: (test)


Finished in 0.02 seconds (0.00s async, 0.02s sync)
2 tests, 1 failure, 1 excluded

Considerations

  • Mapping the test modules to files they are implemented is executed once for every portion of async tests and once for sync tests. Storing the test module -> file mapping in the ExUnit.Server's state can solve this;
  • Since async tests start to run when not all the test modules are loaded, I'm wondering if there's a possibility of any race conditions here. This should not be a problem if the test modules from a single file are loaded in order, but I'm not sure if that's guaranteed.

studzien avatar Jun 28 '22 13:06 studzien

Hi @studzien! I just read the blog post (awsome job!) and I am wondering if a better solution to this problem is for us to introduce:

async: :per_module | :per_test | true | false 

(where true is equivalent to :per_module).

This means the original root issue is no longer that relevant to you and that you no longer need one module per test, but one per file, and that should compile much faster too. Would you like to work on such PR?

josevalim avatar Jul 07 '22 08:07 josevalim

Hi @studzien! I just read the blog post (awsome job!) and I am wondering if a better solution to this problem is for us to introduce:

async: :per_module | :per_test | true | false 

(where true is equivalent to :per_module).

This means the original root issue is no longer that relevant to you and that you no longer need one module per test, but one per file, and that should compile much faster too. Would you like to work on such PR?

Yup, sounds good! Sure!

studzien avatar Jul 07 '22 08:07 studzien

The only thing I am not sure is what to do with the max_cases configuration. But I think we can say that it applies to both numbers of modules and number of tests. This means that you may have max_cases * max_cases tests running at the same time but we can add more fine-grained controls later.

josevalim avatar Jul 07 '22 09:07 josevalim

@josevalim I've been looking into a possible implementation for this. I see two paths in the current runner.ex code:

  1. For async: :per_test, replace run_tests/3 with a function similar to async_loop/3 (but for spawning tests instead). This will result in the potential max_cases * max_cases scenario.

  2. Make the process that was spawned for the module (in spawn_modules/3) it's own little server that suspends once the setup_all is performed. It should then somehow push all the tests that need to be executed for the module back into the async_loop (maybe a new parameter suspended that, when non-empty, takes precedence over fetching new async modules?). async_loop will then perform work according to the max_cases parameters by sending tests to the suspended module process. With this approach, no more tests than max_cases will be executed. I think this would result in the least surprising behaviour, but the implementation is a bit more hands-on.

Which one would be preferred? (or an alternative approach)

drtheuns avatar Oct 13 '22 15:10 drtheuns

I like suggestion 2. Maybe we could push {pid, ref} to ExUnit.Server and then the runner, once it sees such "case", it sends a message to it so it runs next.

josevalim avatar Oct 13 '22 17:10 josevalim

I made some progress on this, but feel a little blocked in the implementation currently. Code here: https://github.com/elixir-lang/elixir/compare/main...drtheuns:elixir:main

I've set it up as follows:

When async: :per_test is set, the module test process will block in run_module with it's own little async_module_loop. It will spawn a process for each test that will wait until it's told to execute by async_loop.

However, I've run into the following problem: the async_loop may be finished with processing the async modules faster than the module process can push the tests into ExUnit.Server. As a result, the async_loop will block until the async modules are finished (the true -> branch), but it never had a chance to fetch the running tests from the server. I feel a bit apprehensive about making spawn_modules blocking, because I don't want to slow down the existing async: true tests, especially if a module has a slow setup_all.

I've thought about introducing an await_modules into the async_loop that holds a list of :per_test refs that haven't notified the loop that they're done pushing tests onto ExUnit.Server. This way, the async_loop has a bit more control over the blocking/awaiting.

Another issue i've though about, is with the max_cases. If the async_loop has max_cases = 10, and spawns 10 module processes before it's able to see any :per_test processes, then it'll basically be stuck; it'll block until a module is done, but a module will never be done because the tests aren't being executed. So perhaps the :per_test module shouldn't count in the max_cases check?

drtheuns avatar Oct 22 '22 21:10 drtheuns

What if we have been approaching this issue the wrong way?

Maybe, instead of swapping it per module, maybe we should swap the whole suite to run tests async via a flag? This way we don't need to consider the interplay of those two options and it should simplify the implementation. In both cases, the concurrency is controlled with max_cases.

WDYT?

josevalim avatar Oct 23 '22 08:10 josevalim

Do you mean something like?

ExUnit.configure(async_mode: :per_test | :per_module)

which would then determine how the runner will treat async: true code?

In the :per_module case, we can just keep the existing implementation.

For the :per_test case we could then ignore the module processes from the max_cases, because other than the setup_all they're not actually executing tests. Perhaps we could still limit the amount of async modules that have been spawned, but at least they won't block the execution of the actual tests.

Is this what you had in mind?

drtheuns avatar Oct 23 '22 10:10 drtheuns

Yes, that's what I had in mind. :)

josevalim avatar Oct 23 '22 10:10 josevalim

Closing this one for now, due to lack of activity. Thank you.

josevalim avatar Jan 17 '24 08:01 josevalim