
WIP: Randomized and parallel tests

Open kdeldycke opened this issue 1 month ago • 11 comments

Helps detect leaks and flaky tests.

This PR is a WIP to check the effects on different platforms and in different contexts.

Preliminary results:

  • Multiple cores are properly detected and tests are run in parallel:

    created: 4/4 workers
    4 workers [1336 items]
    
  • Unit tests fail, as some are leaky.
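
To reproduce locally (assuming pytest-xdist and pytest-randomly are installed as test dependencies, which is what this PR does), something like:

$ uv run pytest --numprocesses=auto

should shuffle the test order and spread the run across all detected cores. pytest-randomly prints the seed it used, and passing --randomly-seed=last replays the previous order.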

Relates to:

  • https://github.com/pallets/click/pull/3139
  • https://github.com/pallets/click/pull/3140
  • https://github.com/pallets/click/issues/3110
  • https://github.com/pallets/click/issues/2993
  • https://github.com/pallets/click/pull/2991
  • https://github.com/pallets/click/issues/824

kdeldycke Nov 19 '25 06:11

The docs describe the test runner as not thread-safe, so I am interested in what you find:

These tools should only be used for testing since they change the entire interpreter state for simplicity. They are not thread-safe! https://click.palletsprojects.com/en/stable/testing/
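
A minimal sketch of what that warning means in practice, using nothing beyond the public CliRunner API: isolation() swaps the interpreter-global streams, so two threads using it concurrently would race on sys.stdout.

    import sys
    from click.testing import CliRunner

    runner = CliRunner()
    real_stdout = sys.stdout
    with runner.isolation():
        # isolation() replaces the process-global sys.stdout, so any other
        # thread printing at this moment would write into this test's buffer
        assert sys.stdout is not real_stdout
    assert sys.stdout is real_stdout

Worth noting that pytest-xdist runs its workers as separate processes rather than threads, so each worker has its own interpreter state; the warning should mostly matter for parallelism within a single worker.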

Rowlando13 Nov 19 '25 06:11

I tried combining https://github.com/pallets/click/pull/3139 and https://github.com/pallets/click/pull/3140, but they don't fix the issue.

kdeldycke Nov 19 '25 06:11

Maybe https://github.com/pallets/click/pull/1572 fixes the issue.

kdeldycke Nov 19 '25 06:11

I tried combining #3139 and #3140, but they don't fix the issue.

The test for #2991/#2993 failed when I tested the #3140 branch, so that doesn't seem surprising. Does this PR with only #3139 on top of stable improve the situation compared to stable?

pjonsson Nov 19 '25 08:11

I don't think you're doing this for performance, but I investigated pytest-xdist in a different context a couple of years ago, and at the time it was difficult to get a speedup because the startup time was linear in the number of workers.

The startup time still seems to be linear in the number of workers, but the constant is much better now. Here is a log with timings for --numprocesses=auto (16), 4, and 1, plus a final run that still has pytest-xdist in the dependencies but does not pass --numprocesses= to pytest.

For comparison, uv run pytest on stable reports 1316 passed, 21 skipped, 1 xfailed in 1.38s on the same machine.

$ uv run pytest tests/test_compat.py
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.1, pluggy-1.6.0
Using --randomly-seed=3856693423
rootdir: /home/ubuntu/src/click
configfile: pyproject.toml
plugins: xdist-3.8.0, randomly-4.0.1
16 workers [1 item]
.                                                                        [100%]
============================== 1 passed in 0.93s ===============================
$ uv run pytest tests/test_compat.py
      Built click @ file:///home/ubuntu/src/click
Uninstalled 1 package in 0.25ms
Installed 1 package in 1ms
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.1, pluggy-1.6.0
Using --randomly-seed=2447227139
rootdir: /home/ubuntu/src/click
configfile: pyproject.toml
plugins: xdist-3.8.0, randomly-4.0.1
4 workers [1 item]
.                                                                        [100%]
============================== 1 passed in 0.47s ===============================
$ uv run pytest tests/test_compat.py
      Built click @ file:///home/ubuntu/src/click
Uninstalled 1 package in 0.30ms
Installed 1 package in 1ms
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.1, pluggy-1.6.0
Using --randomly-seed=3619139103
rootdir: /home/ubuntu/src/click
configfile: pyproject.toml
plugins: xdist-3.8.0, randomly-4.0.1
1 worker [1 item]
.                                                                        [100%]
============================== 1 passed in 0.33s ===============================
$ uv run pytest tests/test_compat.py
      Built click @ file:///home/ubuntu/src/click
Uninstalled 1 package in 0.24ms
Installed 1 package in 1ms
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-9.0.1, pluggy-1.6.0
Using --randomly-seed=2375080509
rootdir: /home/ubuntu/src/click
configfile: pyproject.toml
plugins: xdist-3.8.0, randomly-4.0.1
collected 1 item

tests/test_compat.py .                                                   [100%]

============================== 1 passed in 0.02s ===============================

pjonsson Nov 19 '25 10:11

Does this PR with only #3139 on top of stable improve the situation compared to stable?

No, not in the results I got. I was under the impression that running tests in parallel would uncover the issues you were discussing in #3139 and #3140 with @getzze and @neutrinoceros, but they look completely unrelated.

Here the issues are with the pager:

  FAILED tests/test_utils.py::test_echo_via_pager[test9-cat ] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test4-cat ] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test0-less] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test2- less ] - AssertionErro...
  FAILED tests/test_utils.py::test_echo_via_pager[test2- less] - AssertionError...
  FAILED tests/test_utils.py::test_echo_via_pager[test2- cat ] - AssertionError...
  FAILED tests/test_utils.py::test_echo_via_pager[test0-cat ] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test9-less] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test2-less] - AssertionError:...
  FAILED tests/test_utils.py::test_echo_via_pager[test8-cat ] - AssertionError:...

And which tests fail is completely random, depending on the order in which they're executed (from 6 to 47 failing tests on my machine). That's why I was thinking a solution might lie in the direction of https://github.com/pallets/click/pull/1572.
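
For the record, the seed that pytest-randomly prints in the session header makes a failing order reproducible; the seed below is just an illustration:

$ uv run pytest --randomly-seed=2447227139 tests/test_utils.py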

So I guess you can continue working on #3139 and #3140 independently of this PR.

kdeldycke Nov 19 '25 11:11

No, not in the results I got. I was under the impression that running tests in parallel would uncover the issues you were discussing in #3139 and #3140

Well, #3139 does contain a test that I took from one of the referenced tickets, and that test reliably triggers ValueError: I/O operation on closed file for me when I run pytest with the right parameters, even without parallelism. I think the perceived randomness of the "file" closing issue (buffers in that case) stems from the finalization order of objects, and garbage collectors usually don't give a whole lot of guarantees when it comes to finalization.

Don't take my comment as discouragement from what you're trying to achieve here; I'm just stating that my goal with #3139 was the incredibly narrowly scoped "fix the immediate Click issue I'm suffering from". I would guess that not having multiple objects trying to close the same buffers in their finalizers would also be beneficial here, but I don't know the Click code base, so take my guesses for what they are worth.
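
As a toy illustration of that pattern (not Click's actual code, just the shape of the problem): two wrappers share one buffer, and whichever finalizer runs first closes the stream out from under everything else.

    import io

    class Wrapper:
        def __init__(self, stream):
            self._stream = stream

        def __del__(self):
            # finalizer closes the shared underlying stream
            self._stream.close()

    buf = io.BytesIO()
    a = Wrapper(buf)
    b = Wrapper(buf)
    del a            # a's finalizer runs and closes buf...
    buf.write(b"x")  # ...ValueError: I/O operation on closed file

Which finalizer runs first depends on collection order, which is exactly the kind of non-determinism that looks random from the outside.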

pjonsson Nov 19 '25 12:11

And the failing tests are completely random depending on the order they're executed

Sounds like a job for https://pypi.org/project/detect-test-pollution/ (just passing by, ignore me if that's not actually relevant)
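
If I remember its CLI correctly, you hand it one known failure and the test suite, and it bisects the execution order to find the polluting test, along the lines of (test id taken from the failures above):

$ detect-test-pollution --failing-test 'tests/test_utils.py::test_echo_via_pager[test0-less]' --tests tests/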

neutrinoceros Nov 19 '25 15:11

I'd prefer not adding these as a regular part of the tests; it introduces a lot of complexity and uncertainty. Maybe just use them to clean up tests here and then remove them again.

The tests already run very fast, and I'd prefer to speed them up directly if possible rather than by using multiprocessing.

davidism Nov 19 '25 16:11

Oh that's definitely not a tool you want to run regularly in CI. It's really meant to help for local debugging only.

Edit: hidden as off-topic. I thought David was talking about detect-test-pollution, but on second thought that's probably not what he meant.

neutrinoceros Nov 19 '25 16:11

No, that's fine; I was referring to all three plugins.

davidism Nov 19 '25 16:11