
Is there a recommended way to run the tests in parallel?

Open mtholder opened this issue 10 years ago • 4 comments

the test suite is awe-inspiring... but obviously it takes a while to run. Do you have any tricks for running it in parallel?

I just played around with using concurrencytest on a concurrenttest branch. I did not submit a pull request, as I'm not confident that this is the optimal solution (that package does not seem to be updated often, though perhaps that is not a problem). I've only tested on Python 2.7.9 (on Ubuntu).

It did improve the runtime of:

time python dendropy/test/__main__.py

from:

real    5m26.438s
user    5m14.988s

to:

real  2m30.712s
user  0m0.724s

mtholder avatar Jun 05 '15 09:06 mtholder

PS: I'm not sure what is going on with the huge diff in user. I'd guess that the tests are largely IO bound, but I don't know why that doesn't show up in the single-threaded version's accounting. And 0.724 seems too low for the concurrent version.

mtholder avatar Jun 05 '15 09:06 mtholder

Hi Mark, that's a good idea. I've tinkered around with trying to break the tests into components, but as with anything that requires on-going curation against a constantly changing base, it quickly lapsed into obsolescence (artifacts remain in the "@<TestGroup>" constructs which, unfortunately, are still documented in the README).

The thing about the tests is that I think they can be organized in a much smarter way. Many tests are reported as a single test but actually run several variants in a loop. In some cases the variants are useful, but in others they are just a brute-force/dumb way of checking things that should be invariant with respect to the test — e.g., that the instruction to preserve underscores in a Newick source is honored whether reading from a file/stream, a file path, or a data string. This is cruelly and callously accelerating the heat death of the universe due to brutish design. I think the answer to a lot of this is re-architecting most if not all of the tests around mock fixtures, now that the mock module is part of the standard library. I thought, though, that I might want to make the jump to 4 and finalize 4.0.0 first, and then address bringing the tests onto a smarter and kinder-to-the-universe footing.
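As an illustration of that direction (the reader function and its names here are hypothetical, not DendroPy's actual API), `unittest.mock` can stand in for the file object, so a single test covers the "read from a path" variant without any fixture file on disk:

```python
import unittest
from unittest import mock

# Hypothetical reader: the name and signature are illustrative only.
def read_newick_labels(path, preserve_underscores=False):
    with open(path) as f:
        text = f.read()
    labels = text.strip().rstrip(";").strip("()").split(",")
    if not preserve_underscores:
        labels = [lbl.replace("_", " ") for lbl in labels]
    return labels

class TestUnderscorePreservation(unittest.TestCase):
    def test_preserve_underscores(self):
        # mock_open stands in for the real file, so no fixture file
        # is needed and no disk IO is performed.
        fake = mock.mock_open(read_data="(taxon_a,taxon_b);")
        with mock.patch("builtins.open", fake):
            labels = read_newick_labels("fake.nwk", preserve_underscores=True)
        self.assertEqual(labels, ["taxon_a", "taxon_b"])
```

The same patched fixture could be reused to cover the stream and data-string variants, collapsing the brute-force loop into a handful of targeted tests.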

So for now, I just run the tests using "python -m unittest -fc", which exits on the first failure or keyboard interrupt and reports results, or "python setup.py test > ~/Scratch/fails.txt", and watch funny animal videos on YouTube or something. Obviously, parallelizing the test execution is a much better approach ...

jeetsukumaran avatar Jun 05 '15 13:06 jeetsukumaran

p.s. Did you commit the changes to this branch? Not seeing anything in the diff.

jeetsukumaran avatar Jun 05 '15 13:06 jeetsukumaran

doh! fixed.

mtholder avatar Jun 05 '15 14:06 mtholder

I fixed some shared-state bugs, and the tests should now run fine in parallel. I would recommend the pytest-xdist plugin for pytest: https://pypi.org/project/pytest-xdist/

Then the tests can be distributed across four processes with:

python3 -m pytest -n 4 tests/

mmore500 avatar Oct 03 '23 23:10 mmore500