General questions to orient possible contributions
Hi there!
Nice project! One of my projects is starting to explode in complexity, and I've made it worse for myself by adding an MPI layer. I run into ALL kinds of issues synchronizing tests among MPI processes using Python's unittest module, so I may want to switch to pytest and your plugin. Some questions:
- Does pytest run tests in a deterministic order? Any invisible threading or parallelism going on or anything that interferes with assumptions made when debugging MPI applications?
- Is there any support or API to track which tests have been run in what order? This can be important when certain MPI processes get deadlocked because other MPI processes are running other tests or are stuck in extra `Barrier` calls from across test functions.
- How does pytest deal with MPI? Graceful error handling?
- What is your personal experience and setup to make sure that each test function is truly contained across MPI processes? (As mentioned earlier, imagine one process doing an extra barrier while all the other processes continue and get stuck elsewhere.)
- Are there any timeout mechanisms to make sure tests get aborted if they run too long? (On a per test basis would be a plus)
I know it's a ton of questions! But I'm seriously considering switching and making hefty contributions to this package where MPI users across the world need them ;p I think I can contribute quite well, drawing on experience from some other MPI tooling attempts: a pool implementation and an across-MPI locking package with some neat read/write/collective priority locking.
The main purpose of pytest-mpi has been to assist with using MPI with h5py's tests (ensuring that tests that require MPI only run under MPI, those that do not work under MPI are not run under MPI etc.). It hasn't so far dealt with the more intricate parts of MPI (the most complex thing it's done is provide tempdir/tempfile fixtures that work with MPI).
As for your questions:
- As far as I know, the ordering of tests is deterministic (there's a pytest plugin which makes them random—coupling this with tox which sets a base seed should make this MPI-safe, though I have not tested that).
- I haven't encountered any invisible threading or parallelism; I think those are also left to other plugins (e.g. xdist).
- Tracking tests could likely be done with a plugin (which could be part of pytest-mpi).
- I've been doing `mpirun -n <n> python -m pytest <pytest-args>`, so I'm not effectively using pytest's error handling, and deadlocks are an issue (improvements in this area would be useful). Flipping this so pytest calls MPI may mean that other pytest plugins which handle timeouts etc. could be used, but I haven't experimented with how that sets up Python.
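The test-tracking idea could be prototyped in a `conftest.py` before it becomes a plugin. The sketch below is not part of pytest-mpi: it uses pytest's `pytest_runtest_logstart` and `pytest_runtest_logfinish` hooks to append one line per event to a per-rank file. It assumes the launcher exports one of the listed rank variables (Open MPI, MPICH/Hydra, or PMIx style), and `TEST_ORDER_DIR` is a hypothetical variable introduced only for this example.

```python
# conftest.py (sketch): per-rank log of test start/finish events
import os
import time

# Assumption: the MPI launcher exports one of these rank variables.
RANK_VARS = ("OMPI_COMM_WORLD_RANK", "PMI_RANK", "PMIX_RANK")


def current_rank():
    """Best-effort rank lookup that does not need MPI to be initialised."""
    for var in RANK_VARS:
        if var in os.environ:
            return int(os.environ[var])
    return 0  # not started by a recognised launcher


def record(event, nodeid):
    # TEST_ORDER_DIR is hypothetical; it defaults to the current directory.
    directory = os.environ.get("TEST_ORDER_DIR", ".")
    path = os.path.join(directory, f"test-order.rank{current_rank()}.log")
    with open(path, "a") as fh:
        fh.write(f"{time.time():.3f} {event} {nodeid}\n")
        fh.flush()
        os.fsync(fh.fileno())  # keep the line even if the process is killed


def pytest_runtest_logstart(nodeid, location):
    record("start", nodeid)


def pytest_runtest_logfinish(nodeid, location):
    record("finish", nodeid)
```

After a hang, the last `start` line without a matching `finish` in each rank's file shows which test that rank was in, which makes it easier to spot ranks that ended up in different tests.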
I've been meaning to move this to the pytest-dev organisation, so that it's more obvious how to contribute, but that's fallen off my todo list for now.
Ok cool, thanks for your response! I'll start on a test-tracker and timeout feature for deadlocks for this plugin! Seems like a useful addition.
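One possible shape for the timeout part is a per-test watchdog built on the standard library's `faulthandler`. This is a rough sketch, not the feature as eventually implemented: the `deadlock_timeout` marker, the default of 60 seconds, and the log file name are all invented here. It also assumes pytest's own `faulthandler_timeout` option is unset, since both would share the same timer.

```python
# conftest.py (sketch): abort a rank whose test runs too long
import faulthandler
import os
import tempfile

DEFAULT_TIMEOUT = 60.0  # seconds; hypothetical default
_dump_file = None  # kept open so faulthandler can write to it later


def pytest_configure(config):
    # Register the hypothetical marker so pytest does not warn about it.
    config.addinivalue_line(
        "markers", "deadlock_timeout(seconds): abort if the test runs longer"
    )


def timeout_for(item):
    """Per-test timeout from the marker, else the default."""
    marker = item.get_closest_marker("deadlock_timeout")
    if marker is not None and marker.args:
        return float(marker.args[0])
    return DEFAULT_TIMEOUT


def pytest_runtest_setup(item):
    global _dump_file
    if _dump_file is None:
        path = os.path.join(tempfile.gettempdir(), f"deadlock-{os.getpid()}.log")
        _dump_file = open(path, "w")
    # On expiry: dump all thread tracebacks to the file, then hard-exit.
    faulthandler.dump_traceback_later(timeout_for(item), exit=True, file=_dump_file)


def pytest_runtest_teardown(item, nextitem):
    # The test got this far, so disarm the watchdog.
    faulthandler.cancel_dump_traceback_later()
```

A test would opt in with `@pytest.mark.deadlock_timeout(10)`. Because the watchdog exits the process abruptly, the launcher typically tears down the remaining ranks, but that behaviour depends on the MPI implementation and should be checked for the one in use.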
As for

> Flipping this so pytest calls MPI

I think that could be done if `MPI_Comm_spawn` is available: pytest can just spawn the n desired processes running a test and collect the results.
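A simpler way to try the "pytest calls MPI" direction, without MPI's dynamic process spawning, is to shell out to the launcher. The sketch below swaps in `subprocess` plus `mpirun` for the spawn approach discussed above; the function names and the default timeout are hypothetical.

```python
import subprocess
import sys


def mpi_pytest_command(nprocs, pytest_args, launcher="mpirun"):
    """Build the launcher command line for an inner pytest run."""
    return [launcher, "-n", str(nprocs), sys.executable, "-m", "pytest", *pytest_args]


def run_under_mpi(nprocs, pytest_args, timeout=300.0, launcher="mpirun"):
    """Run pytest on nprocs ranks; return the exit code, or None on timeout."""
    cmd = mpi_pytest_command(nprocs, pytest_args, launcher)
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the launcher at this point;
        # whether the ranks it started also die depends on the MPI
        # implementation, so stray processes may need separate cleanup.
        return None
```

For example, `run_under_mpi(4, ["--with-mpi", "tests/"])` would run the suite on four ranks and report a hang as `None` rather than blocking forever.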
FWIW, there is another MPI testing tool out there called testflo, which only supports unittest unfortunately. But it does spawn MPI processes itself (so each test is using the specified number of processors) and handles a lot of the errors nicely. It has timeouts and some other features too like memory profiling. We have been using it successfully for some time now, but we're also evaluating other options such as pytest-mpi which could provide other features in pytest that aren't available to us at the moment.
Some of the nice-to-have features that we'd be interested in are:
- the ability for `pytest-mpi` to spawn the MPI processes (as done by `testflo`) instead of calling `pytest` via `mpirun`. As far as I can tell, this is the only way to ensure that each test is being run with its own specified number of processors
- some sort of job scheduling ability instead of just spawning as many processes as needed, despite possibly oversubscribing. This may be outside the scope of these plugins though
- easy parameterization of running the same test with different number of processors
The latter two are not available with testflo yet, so we're just keeping an eye out for any developments elsewhere that could prompt a switch.
@nwu63 Cool, didn't know about testflo. I don't have much time for major development on pytest-mpi, but if you want to add support for calling tests under MPI, I'm happy to look at PRs. I suspect point 2 on your list would be quite complex, as you need to have something which runs the tests in parallel but also tracks the number of MPI processes.
I suspect if you got point 1 working, then you could reuse pytest's parametrisation framework to get 3 for free.
Yeah I agree with everything you said there @aragilar. Unfortunately I'm also quite busy these days, but I'll keep this repo in mind and perhaps contribute code in the future. Cheers.