DendroPy
Is there a recommended way to run the tests in parallel?
The test suite is awe-inspiring... but obviously it takes a while to run. Do you have any tricks for running it in parallel?
I just played around with using concurrencytest on a concurrenttest branch (a rough sketch of the wiring appears below, after the timings). I did not submit a pull request, as I'm not confident that this is the optimal solution (that package does not seem to be updated often, though perhaps that is not a problem). I've only tested on Python 2.7.9 (on Ubuntu).
It did improve the runtime of:
time python dendropy/test/__main__.py
from:
real 5m26.438s
user 5m14.988s
to:
real 2m30.712s
user 0m0.724s
PS: I'm not sure what is going on with the huge difference in user time. I'd guess that the tests are largely I/O bound, but I don't know why that doesn't show up in the single-threaded version's accounting, and 0.724s seems too low for the concurrent version.
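For reference, the wiring looks roughly like this (a minimal sketch rather than the exact branch code; it assumes concurrencytest is installed and, since it forks worker processes, a POSIX platform):

import unittest
from concurrencytest import ConcurrentTestSuite, fork_for_tests

# Discover the DendroPy test suite as usual.
loader = unittest.TestLoader()
suite = loader.discover("dendropy/test")

# Wrap it so the test cases are forked out to (here) 4 worker processes.
concurrent_suite = ConcurrentTestSuite(suite, fork_for_tests(4))

runner = unittest.TextTestRunner()
runner.run(concurrent_suite)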
Hi Mark, that's a good idea. I've tinkered around with trying to break the tests into components, but as with anything that requires on-going curation on a constantly changing base, it quickly lapsed into obsolescence (artifacts remain in the "@<TestGroup>" constructs which, unfortunately, are still documented in the README).
The thing about the tests is that I think they can be organized in a much smarter way. There are many tests that are reported as a single test but actually run several variants in a loop. In some cases the variants are useful, but in others they are just a brute-force/dumb way of checking that something, e.g. the instruction to preserve underscores in a Newick source, is honored, by running through cases that really should be invariant with respect to the test, e.g. reading from a file/stream, a file path, or a data string. This is cruelly and callously accelerating the heat death of the universe due to brutish design. I think the answer to a lot of this is re-architecting most if not all of the tests around mock fixtures, now that the mock module is part of the standard library. I thought, though, that I may want to make the jump to 4 and finalize 4.0.0 first, and then address bringing the tests onto a smarter and kinder-to-the-universe footing.
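To make the idea concrete: the toy sketch below is not actual DendroPy test code, and parse_labels is a made-up stand-in for the real Newick reader; it only illustrates how the "read from a file path" variant could be covered by mocking the open call with the standard-library mock module, instead of re-parsing real files for every source type.

import unittest
from unittest import mock

NEWICK = "(A_1:1,(B_2:1,C_3:1):1);"

class UnderscorePreservationTest(unittest.TestCase):

    def parse_labels(self, source_text):
        # Made-up stand-in for the real reader; returns the taxon labels.
        text = source_text.strip("();\n").replace("(", "")
        return [token.split(":")[0] for token in text.split(",")]

    def test_underscores_from_string(self):
        # The behavior under test, exercised once on a data string.
        self.assertIn("A_1", self.parse_labels(NEWICK))

    def test_underscores_from_path(self):
        # The "file path" variant without touching disk: mock the open call.
        with mock.patch("builtins.open", mock.mock_open(read_data=NEWICK)):
            with open("fake.nwk") as src:
                self.assertIn("A_1", self.parse_labels(src.read()))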
So now, I just run the tests using "python -m unittest -fc", which exits on the first failure or keyboard interrupt and reports results, or "python setup.py test > ~/Scratch/fails.txt", and watch funny animal videos on YouTube or something. Obviously, parallelizing the test execution is a much better approach ...
p.s. Did you commit the changes to this branch? Not seeing anything in the diff.
doh! fixed.
I fixed some shared-state bugs and the tests should now run fine in parallel. I would recommend using the pytest-xdist plugin for pytest: https://pypi.org/project/pytest-xdist/
Then the tests can be distributed across four processes with:
python3 -m pytest -n 4 tests/
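For anyone following along: this assumes pytest and pytest-xdist are installed (e.g. "pip install pytest pytest-xdist"). Passing "-n auto" instead of a fixed count lets xdist start one worker per available CPU:

python3 -m pytest -n auto tests/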