Support test sharding in dotnet test

Open Evangelink opened this issue 1 year ago • 6 comments

Is your feature request related to a problem? Please describe.

We are trying to reduce the time of our CI by splitting the execution between multiple runners.

Describe the solution you'd like

Similar to what we can do in Playwright, I think that implementing sharding in the dotnet test CLI would greatly benefit many projects. One possible API, modeled on Playwright's --shard option, could be:

# The first runner could execute this, running the first fifth of the tests in MyTests.dll
dotnet test "MyTests.dll" --shard 1/5
# Run the second fifth of the tests in MyTests.dll
dotnet test "MyTests.dll" --shard 2/5
# ...
dotnet test "MyTests.dll" --shard 3/5
dotnet test "MyTests.dll" --shard 4/5
dotnet test "MyTests.dll" --shard 5/5
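
To make the proposal concrete, here is a minimal Python sketch of how a spec such as 2/5 could map to a slice of the discovered tests. The --shard flag is only proposed in this issue and does not exist in dotnet test; the function names parse_shard and select_shard are hypothetical, and the contiguous split is an assumption chosen to match the "first fifth" wording above.

```python
def parse_shard(spec: str) -> tuple[int, int]:
    """Parse a spec such as "2/5" into (index, total); the index is 1-based."""
    index_text, total_text = spec.split("/")
    index, total = int(index_text), int(total_text)
    if total < 1 or not 1 <= index <= total:
        raise ValueError(f"invalid shard spec: {spec!r}")
    return index, total


def select_shard(test_names: list[str], spec: str) -> list[str]:
    """Return the tests that belong to the shard described by spec."""
    index, total = parse_shard(spec)
    # Sort first so every runner computes the same split,
    # regardless of the order in which discovery reported the tests.
    ordered = sorted(test_names)
    count = len(ordered)
    # Contiguous slices: shard k covers [(k-1)*count/total, k*count/total).
    start = (index - 1) * count // total
    end = index * count // total
    return ordered[start:end]
```

Each runner would call select_shard with the same discovered list and its own spec; because the list is sorted before slicing, the shards are disjoint and together cover every test.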

Additional context

We already tried these two solutions, neither of which completely satisfies us:

Work around the issue by splitting our tests in multiple projects

We can have multiple test projects and configure our CI to split the projects between multiple runners. This is not ideal, as it forces an architecture upon us, multiplying the project count for no purpose other than test execution.

Keep a single test project and split the tests to run manually

We also tried to implement our own "sharding" by:

  1. Listing all of the tests with the dotnet test --list-tests command;
  2. Splitting the result into shards in a bash script;
  3. Having one CI runner per shard.

The output of the dotnet test --list-tests command has to be manipulated several times: we must remove the headers, which cannot be muted, and handle the parameterized tests, which appear multiple times. Those manipulations are tedious and feel hacky compared to a native solution provided by the SDK.
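
The clean-up described above can be sketched in Python as follows. This is an illustration only: the marker line and the indented layout in the sample are assumptions about what the command prints, and the test names and the extract_test_names helper are hypothetical.

```python
import re

# Assumed header line that precedes the test names in the command output.
MARKER = "The following Tests are available:"


def extract_test_names(output: str) -> list[str]:
    """Drop header lines and collapse parameterized cases to one entry each."""
    names: list[str] = []
    seen: set[str] = set()
    in_list = False
    for line in output.splitlines():
        if not in_list:
            # Everything up to and including the marker is header text.
            in_list = line.strip() == MARKER
            continue
        if not line.startswith(" ") or not line.strip():
            continue  # keep only indented, non-empty entries
        # "Add(a: 1, b: 2)" and "Add(a: 3, b: 4)" both become "Add".
        name = re.sub(r"\(.*\)$", "", line.strip())
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Microsoft (R) Test Execution Command Line Tool
Copyright (c) Microsoft Corporation.  All rights reserved.

The following Tests are available:
    MyTests.CalculatorTests.Add(a: 1, b: 2)
    MyTests.CalculatorTests.Add(a: 3, b: 4)
    MyTests.CalculatorTests.Subtract
"""
```

Calling extract_test_names(SAMPLE) yields the two distinct method names, which could then be split into shards.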


Opened by @ThomasFerroAgicap in https://github.com/dotnet/sdk/issues/42986#issuecomment-2474730390

Evangelink avatar Nov 15 '24 12:11 Evangelink

This is why I also opened #3527 and #3528 - I was thinking of a similar approach where we can work out the number of tests and then have a filter that executes a set number of tests from the specified index.

thomhurst avatar Nov 16 '24 20:11 thomhurst

We had a similar request in vstest (which I cannot find right now), but the idea there was to do discovery, and say e.g. run at most 1000 tests per assembly.

The goal was to "split" an assembly to make better use of time and of multiple CPUs. So rather than running like this:

bigAssembly.dll |--------------------------------------------------------------------------------------------------------------------|
assembly1.dll   |-------------------|
assembly2.dll   |-------------------|
assembly3.dll   |-------------------|

it would run like this:

bigAssembly.dll |------------------------------------|
assembly1.dll   |-------------------|
assembly2.dll   |-------------------|
assembly3.dll   |-------------------|
bigAssembly.dll                       |--------------------------------|
bigAssembly.dll                       |------------------------------------------------|

The implementation would then do discovery, split the tests and run them in multiple "copies" of the dll.
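
As a rough illustration of the "at most N tests per copy" idea, the split after discovery could look like the Python sketch below. The helper name split_into_batches is hypothetical, and the cap of 1000 simply reuses the number mentioned above.

```python
def split_into_batches(test_ids: list[str], max_per_batch: int) -> list[list[str]]:
    """Cut the discovered tests into consecutive batches of at most max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [
        test_ids[start:start + max_per_batch]
        for start in range(0, len(test_ids), max_per_batch)
    ]


# Hypothetical assembly with 2500 discovered tests, capped at 1000 per copy.
discovered = [f"Test{i}" for i in range(2500)]
batches = split_into_batches(discovered, 1000)
print([len(batch) for batch in batches])
```

Each batch would then be handed to its own "copy" of the dll. Unlike a fixed shard count, the number of copies here grows with the size of the assembly.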

This should imho be pretty simple to achieve given that we have a client (dotnet test), although to make it really useful, some historical data about how the tests ran would be needed. E.g., if test 4 takes 5 minutes, which part should we put it in?

IIRC Roslyn does this in their CI runs via discovery, historical data and automation. At least I talked about that with David Barbet some time ago, and I believe it was implemented.
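
One common way to use historical durations is a greedy "longest test first" assignment: sort the tests by past duration and always give the next one to the part with the least accumulated time. The Python sketch below illustrates that heuristic under assumed data; it is not a description of what Roslyn or vstest actually do, and balance_by_duration and the sample durations are hypothetical.

```python
import heapq


def balance_by_duration(durations: dict[str, float], shard_count: int) -> list[list[str]]:
    """Assign tests to shards, longest first, always picking the lightest shard."""
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    # Min-heap of (accumulated seconds, shard index).
    heap = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(heap)
    # Longest tests first; ties are broken by name to keep the result stable.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return shards


# Hypothetical history in seconds: test4 is the 5-minute test.
history = {"test1": 10, "test2": 20, "test3": 30, "test4": 300, "test5": 40}
print(balance_by_duration(history, 2))
```

With this data the 5-minute test gets a part to itself and the four short tests share the other one, so the long test no longer drags unrelated tests behind it.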

nohwnd avatar Nov 18 '24 09:11 nohwnd

Isn't this implementable as a custom MTP extension using an orchestrator (putting aside that it's internal only currently - https://github.com/microsoft/testfx/issues/5554 tracks opening it publicly)?

The orchestrator can launch a process that does the discovery and reports back the discovery results (the list of node uids) over pipe communication. Then, once the orchestrator has all the uids, it can launch test hosts (whether in parallel or not) that run the given test node uids (the retry orchestrator already does that).
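
The flow described above can be modeled in a few lines of Python. This is only a conceptual sketch: it does not use the MTP orchestrator API (which is internal), and discover and run_host are hypothetical callables standing in for the discovery process and the test host processes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def orchestrate(
    discover: Callable[[], Sequence[str]],
    run_host: Callable[[Sequence[str]], int],
    host_count: int,
) -> int:
    """Discover uids once, then run groups of them in parallel hosts."""
    uids = list(discover())
    # Host k gets every host_count-th uid, starting at position k.
    groups = [uids[k::host_count] for k in range(host_count)]
    groups = [group for group in groups if group]  # skip hosts with nothing to run
    with ThreadPoolExecutor(max_workers=host_count) as pool:
        exit_codes = list(pool.map(run_host, groups))
    # Report failure if any host failed; 0 when everything passed.
    return max(exit_codes, default=0)
```

In a real setup both callables would start child processes and exchange results over a pipe; here they are plain functions so that the control flow stays visible.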

Youssef1313 avatar May 28 '25 17:05 Youssef1313

Yes orchestrator is meant for that. It's currently not entirely polished that's why it wasn't yet open.

Evangelink avatar May 29 '25 15:05 Evangelink

It's currently not entirely polished that's why it wasn't yet open

@Evangelink Anything specific needed to be able to open it?

Youssef1313 avatar May 29 '25 16:05 Youssef1313

Just mentioning, this was implemented in NUnit as a partitionfilter back in 2023, see the discussion at https://github.com/nunit/nunit/discussions/5059

OsirisTerje avatar Nov 20 '25 12:11 OsirisTerje