pytest icon indicating copy to clipboard operation
pytest copied to clipboard

pytest 8.0 sorting tests with multiple parameterization is broken

Open ShurikMen opened this issue 1 year ago • 13 comments

Сontinue #11976

The solution from #11976 did not solve the problem with sorting tests with multiple parameters

Tests

import pytest


@pytest.mark.parametrize('proto', ['serial', 'telnet', 'ssh'], scope='class')
@pytest.mark.parametrize('unit', [1, 2, 3], scope='class')
class TestA:
    def test_one(self, proto, unit):
        pass

    def test_two(self, proto, unit):
        pass

Collecting items with pytest 7.4.4

11:07 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-7.4.4, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Package tests>
  <Module test_asd.py>
    <Class TestA>
      <Function test_one[1-serial]>
      <Function test_two[1-serial]>
      <Function test_one[1-telnet]>
      <Function test_two[1-telnet]>
      <Function test_one[1-ssh]>
      <Function test_two[1-ssh]>
      <Function test_one[2-serial]>
      <Function test_two[2-serial]>
      <Function test_one[2-telnet]>
      <Function test_two[2-telnet]>
      <Function test_one[2-ssh]>
      <Function test_two[2-ssh]>
      <Function test_one[3-serial]>
      <Function test_two[3-serial]>
      <Function test_one[3-telnet]>
      <Function test_two[3-telnet]>
      <Function test_one[3-ssh]>
      <Function test_two[3-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================

Collecting items with pytest 8.0.1

11:08 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-8.0.1, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '8.0.1', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Dir pytest>
  <Package tests>
    <Module test_asd.py>
      <Class TestA>
        <Function test_one[1-serial]>
        <Function test_two[1-serial]>
        <Function test_one[2-serial]>
        <Function test_two[2-serial]>
        <Function test_one[2-telnet]>
        <Function test_two[2-telnet]>
        <Function test_one[1-telnet]>
        <Function test_two[1-telnet]>
        <Function test_one[3-telnet]>
        <Function test_two[3-telnet]>
        <Function test_one[3-serial]>
        <Function test_two[3-serial]>
        <Function test_one[3-ssh]>
        <Function test_two[3-ssh]>
        <Function test_one[2-ssh]>
        <Function test_two[2-ssh]>
        <Function test_one[1-ssh]>
        <Function test_two[1-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================

ShurikMen avatar Feb 19 '24 04:02 ShurikMen

Thanks, this does look wrong at first glance, I will look into it. Bisected to 09b78737a5bde23eb488d5d6649bf966c5809176 (PR #11220).

bluetech avatar Feb 19 '24 06:02 bluetech

Minimized example:

@pytest.mark.parametrize('proto', ['a', 'b'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class Test:
    def test(self, proto, unit):
        pass

The items are reordered by reorder_items() in fixture.py (surely the most inscrutable function in all of pytest), whose aim is to minimize fixture setups and teardowns. In this respect it achieves its goal:

Setup plan in pytest 7, has 6 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C proto['b']
      SETUP    C proto['a']
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      TEARDOWN C proto['b']

in pytest 8, has 5 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      SETUP    C unit[1]
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      TEARDOWN C proto['b']

The way @pytest.mark.parametrize works behind the scenes is that it basically desugars to this:

import pytest

@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param

class Test:
    def test(self, unit, proto):
        pass

In this framing, it is clearer why we want to minimize the setups/teardowns -- the fixtures can do real work, not just return the param. Both pytest 7 and 8 reorder this version to have 5 setups/teardowns.

I can't say why before 09b78737a5bde23eb488d5d6649bf966c5809176 the reordering didn't happen, and if this was intentional or accidental. Will look into it more.

bluetech avatar Feb 20 '24 21:02 bluetech

@bluetech i believe the change by @sadra-barikbin fixed a bug in pytest wrt scope ordering

it dates back to #519 and the fix corrects the test scope ordering based on values and fixture values and scopes

for further validation we might want to add a variant of the example for #519 that mixes class scope in (in which case the old order is actually correct

but the basic gist to my current understanding is that pytest no longer has ordering differences between sugared and de-sugared parameterize as now parameterize is expressed in pseudo fixtures all the way

RonnyPfannschmidt avatar Feb 21 '24 11:02 RonnyPfannschmidt

Technical details of the change:

Before 09b78737a5bde23eb488d5d6649bf966c5809176, Metafunc will generate CallSpec's with separate params (indirect parametrizations, i.e. through fixtures) and funcargs (direct parametrizations), and the desugaring of direct to indirect (i.e.g funcargs to params) happened as a separate step after pytest_generate_functions: https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L156.

After 09b78737a5bde23eb488d5d6649bf966c5809176, Metafunc handles the desugaring itself, and there is no more funcargs.

All of this happens before reorder_items anyway, so why does it affect the item ordering? The difference is the param_index that the desugaring assigns to the (desguared) direct params. Before the code was this:

https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L168-L181

The important bit is callspec.indices[argname] = len(arg2params_list) - this basically assigns a fresh param index across all callspecs. This results in the following arg keys in reorder_items:

('proto', 0, <Function test[1-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 2, <Function test[2-a]>)
('proto', 3, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  1, <Function test[1-b]>)
('unit',  2, <Function test[2-a]>)
('unit',  3, <Function test[2-b]>)

After, there is no special handling of the param indexes of direct params, they are handled same as parametrized fixtures. This results in the following arg keys:

('proto', 0, <Function test[1-a]>)
('proto', 0, <Function test[2-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 1, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  0, <Function test[1-b]>)
('unit',  1, <Function test[2-a]>)
('unit',  1, <Function test[2-b]>)

Rephrasing the above in a way that may be clearer:

Before, indexes for direct params were assigned in a sequential manner per argname after all parametrizations are exploded, so we have a table of argname, item (= callspec) and we assign the param index:

argname item param index
proto test[1-a] 0
proto test[1-b] 1
proto test[2-a] 2
proto test[2-b] 3
unit test[1-a] 0
unit test[1-b] 1
unit test[2-a] 2
unit test[2-b] 3

After, the param indexes are assigned as they would for the desugaring I gave above:

import pytest

# param indexes:        0  1
@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

# param indexes:         0    1
@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param

bluetech avatar Feb 22 '24 07:02 bluetech

@ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values. With function scope the ordering is as you expect.

bluetech avatar Feb 22 '24 08:02 bluetech

@ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values.

This is one of the simplest examples that are actually used in my projects. There are many examples with an even larger set of parameters including "mixed" scopes (class/function).

With function scope the ordering is as you expect.

Not quite like that. Changing the order causes unwanted fixture calls. They are very expensive in terms of execution time. It is important for me that tests with the same set of parameters are called sequentially.

With scope class (same optimal):

import pytest


@pytest.fixture(autouse=True, scope='class')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass
test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-serial] 
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-telnet] 
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C some_fix
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-serial] 
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-telnet] 
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
      TEARDOWN C some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

With scope function

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='function')
@pytest.mark.parametrize('unit', [1, 2], scope='function')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass

test_asd.py::TestA::test_one[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_one[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]

Class params with function fixture

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass

test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

Here I experimented a lot with various combinations of scope parameters and scope fixtures, including parameterization in fixtures (pytest.fixture(param=...)) to obtain the most optimal ways to perform fixtures and tests.

ShurikMen avatar Feb 22 '24 08:02 ShurikMen

Currently when parameterize is taken for consideration, compound dependent fixtures are not

Id recommend having a single parameterset that creates the correct compound parameters to a single fixture so it will no longer be considered as independent parameters

RonnyPfannschmidt avatar Feb 22 '24 09:02 RonnyPfannschmidt

@bluetech take a look at #12082

ShurikMen avatar Mar 06 '24 08:03 ShurikMen

Here are the three approaches: pytest-reorder

sadra-barikbin avatar Mar 22 '24 14:03 sadra-barikbin

Seems we have an appearance-efficiency trade-off, with @ShurikMen's suggestion and v7.4.4 approach yielding better appearance but having lower efficiency in setup-teardowns in comparison with v8.0.0.

The three approaches could also be compared in terms of robustness to parametrization varieties which @ShurikMen 's suggestion for example performs better than v7.4.4 and solves the first example of #11257 but fails on the example below which v8.0.0 solves, unless the user be cautious and swap the order of parametrizations on test1.

# Here `test1["b",0]` and `test2[3]` would wrongfully have a common fixturekey in @ShurikMen 's method.
@pytest.mark.parametrize("arg2", [0, 1, 2], scope='module')
@pytest.mark.parametrize("arg1", ["a", "b"], scope='module')
def test1(arg1, arg2):
    pass

@pytest.mark.parametrize("arg2", [0, 1, 2, 3], scope='module')
def test2(arg2):
    pass

Considering robustness alone, @ShurikMen method's gain over v7.4.4 seems remarkable but I'm not sure about that of v8.0.0 over @ShurikMen 's.

sadra-barikbin avatar Mar 23 '24 23:03 sadra-barikbin

Aside by the comments above,I'm for the notion of bug for current reordering in v8.0.0 as it has not been introduced by the intention to become more efficient which we discussed here about. It either has been an unnoticed side-effect of #11220 or something in my large initial PR, responsible for the first example of #11257 (which @ShurikMen 's method now solves as well). Sorry for inconvenience, mates!

sadra-barikbin avatar Mar 24 '24 00:03 sadra-barikbin

I have updated the pr. Unified indexing of parameters added through a marker and through fixtures. I think it's the right thing to do. Along the way, I corrected the tests related to parameterization through fixtures.

ShurikMen avatar Mar 24 '24 07:03 ShurikMen

@bluetech what about solving the problem on this issue?

ShurikMen avatar May 07 '24 03:05 ShurikMen