pytest-split Splits invalid when collection order not deterministic

The big assumption underlying the two splitting algorithms is that the order of collected items is constant. However, I've come across a case where this assumption was violated. In my case I had a test parametrised with pytest.mark.parametrize, but the items to parametrize with would sometimes change order.

Take this example:

import pytest

@pytest.mark.parametrize('name', set(['henk', 'ingrid']))
def test_hello(name):
    pass

If you run this often enough you'll see that the order changes:

[2021-06-17 22:47:10] test_temp.py::test_hello[henk] PASSED [ 50%]
[2021-06-17 22:47:10] test_temp.py::test_hello[ingrid] PASSED [100%]

and

[2021-06-17 22:47:10] test_temp.py::test_hello[ingrid] PASSED [ 50%]
[2021-06-17 22:47:10] test_temp.py::test_hello[henk] PASSED [100%]
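
To make the failure concrete, here is a minimal sketch of what can go wrong. It assumes each group is simply a contiguous slice of the collected list; `slice_for_group` is a hypothetical helper for illustration, not pytest-split's actual code. The set order can differ between interpreter runs because string hashes are randomised per process, so two CI machines may well see the two orders above.

```python
def slice_for_group(collected, group, n_groups):
    """Return the contiguous slice of `collected` assigned to the 1-based `group`."""
    size, extra = divmod(len(collected), n_groups)
    start = (group - 1) * size + min(group - 1, extra)
    end = start + size + (1 if group <= extra else 0)
    return collected[start:end]


# The two collection orders observed above, one per (hypothetical) CI machine.
order_on_machine_a = ["test_hello[henk]", "test_hello[ingrid]"]
order_on_machine_b = ["test_hello[ingrid]", "test_hello[henk]"]

# Machine A runs group 1, machine B runs group 2, each slicing its own order.
group_1 = slice_for_group(order_on_machine_a, group=1, n_groups=2)
group_2 = slice_for_group(order_on_machine_b, group=2, n_groups=2)

executed = group_1 + group_2
never_run = set(order_on_machine_a) - set(executed)

print(executed)   # the same test case appears twice
print(never_run)  # and the other one is silently skipped
```

With these inputs both groups pick `test_hello[henk]`, and `test_hello[ingrid]` is never run on any machine.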

I'm not sure how to address this, but I think there are a few options:

1. Don't split over different parametrize values for the same test. In other words, make sure that a single group runs all test cases for test_hello.
2. Try to create a deterministic order out of the test cases by sorting. I'm not sure this will work in all cases though (for example, it might not work for objects).
3. Do the splitting on one machine, save the splits, and just call pytest with those pre-calculated groups (so not really using this plugin as a plugin :p).
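
As a rough illustration of the sorting option, a user could try this in their own conftest.py today. It is a minimal sketch, not something the plugin does: it uses pytest's `pytest_collection_modifyitems` hook and sorts the collected items in place by node id. Whether it runs before the plugin's own splitting depends on hook ordering, which I have not verified. The `SimpleNamespace` objects are stand-ins so the hook can be exercised outside pytest.

```python
from types import SimpleNamespace


# conftest.py
def pytest_collection_modifyitems(session, config, items):
    # Sort in place: the hook is expected to mutate the list it is given.
    items.sort(key=lambda item: item.nodeid)


# Stand-in items with a `nodeid` attribute, only to try the hook without pytest.
items = [
    SimpleNamespace(nodeid="test_temp.py::test_hello[ingrid]"),
    SimpleNamespace(nodeid="test_temp.py::test_hello[henk]"),
]
pytest_collection_modifyitems(session=None, config=None, items=items)
sorted_ids = [item.nodeid for item in items]
print(sorted_ids)
```

This only helps when the node ids themselves are stable. For parameters that are arbitrary objects, pytest (as far as I know) derives the id from the argument name plus the position, e.g. `name0` and `name1`, so sorting would not pin a particular object to a particular id. Sorting also changes the execution order of the whole suite, which may matter for suites that rely on it.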

Jun 17 '21 20:06 mbkroese

I assume it's not deterministic here because of set, or can you repro it also with list or tuple?

Jun 20 '21 16:06 jerry-git

No, this problem occurs when either the data structure has non-deterministic order or the code generating the parametrised test cases is for some reason not deterministic.

Jun 20 '21 17:06 mbkroese

I think we could go with option 1, i.e. make sure the tests inside the same parametrize are run in the same group. However, the downside is ofc that if one parametrised test is very time consuming vs the rest of the suite, the splits would not be great.

OTOH, maybe it's better to make sure that we don't accidentally skip tests (or run some test in multiple groups) 🤔

With these thoughts, I'd go with 1. 🙂

Jun 21 '21 08:06 jerry-git

> downside is ofc that if one parametrised test is very time consuming vs the rest of the suite, the splits would not be great.

Yes, and I wonder if we should perhaps be safe by default (i.e. option 1) and allow users to opt into the unsafe thing (the existing behaviour). If we make clear what the tradeoffs are, users can then decide for themselves. In other words, add a parameter that defaults to --split-level=func but can be set to --split-level=parametrized or --split-level=file?
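
A rough sketch of how the proposed levels could map node ids to "units that must stay in one group". Everything here is hypothetical: `--split-level` does not exist yet, and `split_key` and `group_units` are names made up for illustration. It also assumes that "[" only appears in the parametrize suffix of a node id.

```python
from collections import defaultdict


def split_key(nodeid, level):
    """Map a pytest node id to the unit that must stay within one group."""
    if level == "parametrized":
        return nodeid                    # every parametrised case is its own unit
    if level == "func":
        return nodeid.split("[", 1)[0]   # drop the "[param-id]" suffix
    if level == "file":
        return nodeid.split("::", 1)[0]  # keep only the file path
    raise ValueError(f"unknown split level: {level!r}")


def group_units(nodeids, level):
    """Bucket node ids by unit; keys and members are sorted to be order-independent."""
    units = defaultdict(list)
    for nodeid in nodeids:
        units[split_key(nodeid, level)].append(nodeid)
    return {key: sorted(units[key]) for key in sorted(units)}


collected = [
    "test_temp.py::test_hello[ingrid]",
    "test_temp.py::test_hello[henk]",
    "test_temp.py::test_other",
]
by_func = group_units(collected, "func")
by_file = group_units(collected, "file")
print(by_func)
print(by_file)
```

With `func`, both `test_hello` cases land in the same unit regardless of the order in which they were collected, so a change in parametrize order can no longer move a case across groups. The splitter would then distribute whole units, rather than individual items, over the groups.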

Jun 22 '21 19:06 mbkroese